Limitations of automated testing
From SiteRay wiki
There are limitations to what can be accomplished via automated website testing. Understanding these can help you make better use of SiteRay and other tools.
Computers can't solve open-ended problems
Currently computer software is not intelligent, and can only follow rules that have been defined for it. This means that the computer is only as accurate as the rules it is given to follow.
Most of the problems faced by SiteRay are open-ended, i.e. it is impossible or impractical to define a set of rules which would perfectly test for every given set of criteria. As a simple example, consider spell checking. A naive computer program might look like this:
- Find all words on a page
- Check them in a dictionary
- Mark words which are not found as mis-spelt
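The three steps above can be sketched in a few lines of Python. This is only an illustration, not SiteRay's code: the `DICTIONARY` set and the `naive_spell_check` name are hypothetical, and the dictionary is deliberately tiny.

```python
# Hypothetical, deliberately tiny dictionary; a real one would hold many thousands of words.
DICTIONARY = {"the", "cat", "sat", "on", "mat", "visit", "for", "details"}


def naive_spell_check(text, dictionary=DICTIONARY):
    """Return every whitespace-separated token that is not in the dictionary."""
    misspelt = []
    for token in text.split():
        # Strip surrounding punctuation such as trailing commas and full stops.
        word = token.strip(".,;:!?\"'()")
        # Compare case-insensitively, which already ignores any case rules.
        if word and word.lower() not in dictionary:
            misspelt.append(word)
    return misspelt
```

Calling `naive_spell_check("Visit www.example.com for details")` reports `www.example.com` as mis-spelt, even though nothing is wrong with it.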
In reality, although this appears 'correct', it is hopelessly impractical. None of the following, for instance, appears in a standard English dictionary:
- 1,234,567 - a random number
- 02/07/1979 - a date
- 385th - an ordinal
- www.example.com - a web address
- example.com - a web address, or perhaps two words with no space around the period
- NATO - an acronym, immediately apparent due to capitalisation
Adding these words to a dictionary is possible, but impractical. Adding every possible number, date and ordinal, for instance, would introduce an infinite set of entries to our dictionary. So instead we add rules that detect these special cases and ignore them.
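Such rules might be written as regular expressions, one for each kind of token in the list. The patterns below are assumptions made for illustration, not SiteRay's actual rules, and each is far looser than a production rule would need to be.

```python
import re

# Hypothetical ignore rules; a word is skipped if any rule matches it in full.
IGNORE_RULES = [
    re.compile(r"\d{1,3}(,\d{3})*|\d+"),      # numbers such as 1,234,567
    re.compile(r"\d{1,2}/\d{1,2}/\d{2,4}"),   # dates such as 02/07/1979
    re.compile(r"\d+(st|nd|rd|th)"),          # ordinals such as 385th
    re.compile(r"(www\.)?[a-z0-9-]+(\.[a-z0-9-]+)+", re.IGNORECASE),  # web addresses
    re.compile(r"[A-Z]{2,}"),                 # acronyms such as NATO
]


def should_ignore(word):
    """Return True if the word matches a rule and should skip the dictionary lookup."""
    return any(rule.fullmatch(word) for rule in IGNORE_RULES)
```

The rules trade one kind of error for another. `should_ignore("cat.sat")` returns `True`, because two words with a missing space look exactly like a web address to the fourth rule, so that genuine mistake would now go unreported.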
Other words are spelt correctly or not depending on their case and their context in a sentence. For example:
- "I said hello" is correct
- "i said hello" is not
- "Hello David" is correct
- "Hello david" is not
To confuse matters further, it is possible to intentionally mis-spell something, in a way only a human would realise:
You should always write "London" not "london".
Mixing languages complicates matters further: a word can be spelt correctly in one language but not in another, and languages can even be mixed within a single sentence:
- He was well known for his joie de vivre - correct
- le cat sat on de mat - garbage mix of English, French, German
In fact, this apparently simple problem soon requires an almost infinite set of rules to implement, and those rules differ between languages and are always changing.
Yet the spell checker is not a redundant or impossible piece of technology, merely an inherently flawed one. The same difficulties exist with any major spell-checking application: computers cannot perform flawless checking against an infinite rule set. They can however get close enough to be very useful: pointing out possible mis-spellings, and giving human beings the ability to apply context themselves on top - for example, adding a word to a dictionary.
Similar problems are exhibited by GPS devices, facial recognition software and fingerprint scanners.
Software can never be completely tested
SiteRay attempts to simulate the behaviour of a webpage in order to analyse it. Where the page includes a programming language (e.g. JavaScript), however, a complete test often becomes impossible.
As a simple example, consider a page with JavaScript which does the following:
- Check the day of the week
- If the day is Friday, go to thank-god-its-friday.html
- Otherwise go to just-another-day.html
For SiteRay to simulate this code, it would need to identify the variables which have an effect on the outcome (here, the day of the week) and vary them, effectively creating two versions of this program. The JavaScript would then be executed for each possible outcome. Even for this trivial example, the task requires both a complete execution of the program and the ability to deduce the input variables from its code.
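To make the idea of varying the inputs concrete, here is a minimal Python sketch. The `destination` function is a hypothetical stand-in for the page's JavaScript, and `explore` is an invented helper; neither reflects how SiteRay is actually built.

```python
def destination(day_of_week):
    """Python stand-in for the page's script; 0 is Monday, so 4 is Friday."""
    if day_of_week == 4:
        return "thank-god-its-friday.html"
    return "just-another-day.html"


def explore(script, input_values):
    """Run the script once per candidate input and collect the distinct outcomes."""
    return {script(value) for value in input_values}


# Seven executions are needed to discover the two possible destinations.
pages = explore(destination, range(7))
```

Note what the sketch takes for granted: the caller already knows that the only input is the day of the week and that it has seven possible values. Deducing that from arbitrary code is the hard part.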
A more complex example would break this model further:
- Ask the user for their birthday
- Go to date.html, where date is their date of birth expressed as dd/mm/yy
In this example, SiteRay would need to understand the input parameters and their boundary conditions (e.g. no more than 31 days in a month) and convert these into a series of potential permutations, each resulting in a new page. If any of this were wrong, the permutations returned would be wrong too.
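A rough Python sketch shows how quickly the permutations grow and how sensitive they are to the assumed boundary conditions. The `birthday_pages` function and its `century_start` parameter are hypothetical; the page names follow the dd/mm/yy pattern literally as described above.

```python
import calendar


def birthday_pages(century_start):
    """List the date.html targets for every valid dd/mm/yy in one assumed century."""
    pages = []
    for yy in range(100):
        # A two-digit year is ambiguous, so the century has to be assumed.
        year = century_start + yy
        for month in range(1, 13):
            # monthrange returns (weekday of the 1st, number of days in the month).
            days_in_month = calendar.monthrange(year, month)[1]
            for day in range(1, days_in_month + 1):
                pages.append(f"{day:02d}/{month:02d}/{yy:02d}.html")
    return pages
```

Even this single input yields tens of thousands of candidate pages, and the answer depends on a guess: `29/02/00.html` is a valid page if the century is assumed to start in 2000, but not if it starts in 1900, which was not a leap year.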
With real software, it is common to encounter programs with infinite inputs, outcomes and execution paths (Google Maps, for instance). Variables can alter during execution and include mouse position, button presses and complex interactions.
SiteRay therefore adopts a simpler model which is nevertheless effective in the majority of instances. Given our first example, SiteRay simply checks for outputs it understands ("go to page..."), and assumes they are all possible - it can't know for sure. This would leave us with:
- thank-god-its-friday.html
- just-another-day.html
This is correct. For the second example, it would simply fail to find anything, as the sequence is indeterminate without complete analysis. For the vast majority of cases this is sufficient.
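One way to picture this simpler model is a scan of the script's source text for navigation statements with a literal target. The Python sketch below is an assumption about how such a scan could look, not a description of SiteRay's implementation; the `NAVIGATION` pattern, `find_targets` and both sample scripts are invented for illustration.

```python
import re

# Assumed pattern: a navigation written as location.href = "literal" or location = 'literal'.
NAVIGATION = re.compile(r"""location(?:\.href)?\s*=\s*(["'])([^"']+)\1""")


def find_targets(script_source):
    """Return every literal navigation target, without executing the script."""
    return sorted({match.group(2) for match in NAVIGATION.finditer(script_source)})


# Hypothetical JavaScript for the first example (getDay() returns 5 on Friday).
FRIDAY_SCRIPT = """
if (new Date().getDay() === 5) {
    window.location.href = "thank-god-its-friday.html";
} else {
    window.location.href = "just-another-day.html";
}
"""

# Hypothetical JavaScript for the second example: the target is built at run time.
BIRTHDAY_SCRIPT = """
var date = prompt("When is your birthday?");
window.location.href = date + ".html";
"""
```

`find_targets(FRIDAY_SCRIPT)` returns both pages without knowing which day it is, while `find_targets(BIRTHDAY_SCRIPT)` returns an empty list because no literal target appears in the source.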
Simulation of browsers is approximate
SiteRay simulates web browser behaviour to test pages. Because web browsers are immensely complex and their behaviour varies, it is not possible or practical to simulate all browsers perfectly. In fact, one of the greatest challenges of modern web browsers is getting them to behave consistently themselves.
Accordingly, there will be small variations between the behaviour of SiteRay, Internet Explorer, Firefox, Safari, Opera and other browsers, and these will shift over time. In the case of SiteRay, because we don't aim to render pages like browsers, our simulation can afford some degree of laxity. Rare subtleties in CSS parsing and browser-specific hacks can create problems, however.
As a general rule, SiteRay aims to simulate a standards-compliant browser (specifically, Firefox) with some additional behaviour to address Internet Explorer-specific hacks.
