Comments from Expert Participants on the Preliminary Turing Test


Score: 12

I looked for the classes that varied most in the map, spotting the ones that had the most ``natural'' shapes. My criterion for naturalness was the location of the class relative to the ``stable'' classes.

The use of a Turing Test is an interesting way to determine whether a map looks realistic. However, I think a map *being* more realistic would be more important than a map *looking* more realistic.


Score: 9

The fractal landscape was obviously based upon the real one, in each case (similarity of certain patterns between each pair). However, it was quite hard to tell which was which, especially without any feedback on whether my judgement was correct in each case! I looked for ``natural'' patterns such as river courses or ranges of hills (I noticed some similarities in a couple to the Smoky Mountains/Appalachian chain pattern of parallel ridges and valleys) and classes grading into one another. I also looked for ``unnatural'' patterns like large fairly regular blocks of single colours.

Hope this is useful. I like this video game!


Score: 9

I tried to distinguish the landscapes by pattern recognition. I selected based on which image looked more like ``real'' land classifications (i.e., streams and vegetation coverage patterns, urban growth patterns, etc.)

This was really neat! I'd bet there are several GIS experts you could fool with this!


Score: 13

I looked for `flow' between patterns. In my experience, landscape patterns are generally not disjointed; patterns of similar type are linked together. I generally identified a map as real if I perceived a higher degree of linkage between similar patterns relative to the other map in the pair. This was very difficult for some of the maps that had complex patterns of shading.

As for comments, I found this to be very interesting, easy to follow, and informative. Let me know how it comes out. Also, if I can be of further help, let me know. BTW, I am wondering what the general rate of correct classification is for all participants.


Score: 15

I based my answers on the assumption that digitally created landscapes would have more detail than actual maps. I tried to choose the map that had less definition. It was very difficult to choose with some of the pictures. The fractal system is very close to actual maps. There may have been other ways to determine the actual maps, but I do not have the specific training to distinguish the more technical features.

Nice job!....Where's my banana?


Score: 12

Mainly I picked maps that showed an integration of colors, not just large blocks of color. The maps were so much the same that I am not sure I see the point in asking us to ``guess'' what is real and what is not without any idea of what the maps are trying to show. This seemed more like a probability test than reading a map. Thanks for the opportunity to participate.


Score: 13

I searched for continuity of dendritic patterns and linear features on the assumption that these represented stream patterns and linearities such as ridges and valleys, respectively. I also looked for simpler patterns, i.e. I tended to discount maps that included complex or speckled patterns in favor of those with less complex, more continuous patterns.

I would comment that not knowing the scale of the maps made the choices difficult and often arbitrary. I.e., in many if not most cases, there was no apparent rational basis upon which to make a choice.

You appear to have met your objective. However, I would argue with Turing's definition of intelligence. A fool is often convincing.


Score: 15

Linear features are probably harder to ``fake'' than more circular ``blobs''.


Score: 11

  1. color following contours of linear items (e.g. streams)

  2. not too much fine-grained fractal-looking pattern

  3. anisotropy present

  4. variation from fine to chunky-looking within one plot

Neat! It looks like some features were fixed in both?


Score: 11

I tried to find maps with a ``fractal look'' over the whole area, but I did no better than random selection! Smart development!


Score: 6

  1. The simulated maps were reasonable representations of the spatial arrangement of a landscape. However, the members of each pair definitely differed from each other in pattern. So if the question is: did the generator produce a simulated landscape that looked real? Yes, it did. If the question is: did the generator produce a realistic simulated landscape that reflected the spatial arrangement of the real landscape from which the spatial constraints were derived? The answer is mixed.

  2. Selection was simple. I tried not to bring preconceived notions as to what the real landscape was. For the most part my choices were random. A small percentage seemed unreasonable to me.

  3. Most of the time I scanned each. If they were both reasonable, I just selected one I was in the mood to select. If the map had some ``illogical'' speckling pattern then I would select the other.

As a generator of neutral models, the generator seems very good. As a replicator of spatial pattern while maintaining constraints from the real map, I have mixed reviews (not bad, but not excellent either). Overall: very interesting.


Score: 14

Looked for layers in the patterns; all landscapes have some features that show drainage or geologic changes. Looked for patterns in the data that are typical of processing artifacts.


Score: 10

Distinguishing landscapes:

  1. Identify dendritic stream networks

  2. patterns/similarity in spatial relationships between land cover types (spatial autocorrelation) throughout map

  3. aggregation of land cover types

  4. looked for patterns I generally see in land use maps (recently, most often using USGS LUDA maps)

Numbers 3 and 4 above really influenced me in relation to scale. Information on scale/resolution beforehand probably would have influenced my answers to a great degree. I was second-guessing resolution/scale in trying to determine the degree of 'contagion'.


Score: 12

I looked for non-natural-looking features: straight lines, large homogeneous chunks. It looks like you held one land use type constant for both maps. Is this true? If so, you should tell us up front.


Score: 12

I tried to look at edges between colors, and some maps simply looked more ``map-like''. It was tough, though!


Score: 7

I couldn't tell the difference in the pairs with regard to accuracy. That could very likely be from my lack of expertise rather than the quality of the images.


Score: 7

Level of detail and patterns (natural vs. ``synthetic''). I trust you are going to share your overall results, as I would be very interested in them.


Score: 15

I tried looking for any indication of peak and valley lines. Does the speed and order in which the choices come up have anything to do with the ``on the fly'' generation of the landscapes? Excellent job; I look forward to reading about your overall results.


Score: 8

Realistic appearance, detail (fractal simulation may have more detail)


Score: 7

I tried to distinguish mostly by the drainage patterns - very interesting test!


Score: 12

I have some knowledge of fractal images from applied mathematics, and I tried to find fractal-like clusters in the simulated landscapes. I thought areas with strange borders and multiple overlapping classes were not real. Since I only got 12 right -- what do I know!


Score: 5

Since I only chose 5 as real which were correct, I'd say the generated maps fooled me pretty much. Good luck with the project.


Score: 14

I tended to choose against landscapes where there were many single isolated pixels or a horizontal or perpendicular element. At least one of the maps reminded me of the Ridge and Valley province. I am impressed. I am also amazed that I did better than 50%! I will be curious to see the results of your survey. Thanks for asking me.


Score: 13

Without knowing the process generating the maps, any map could be the real map.


Score: 8

I think I was misled in thinking the real maps were classified imagery. I can see that the clustering technique could introduce some weird patterns that I thought were from the Fractal Realizer. That is, combining temperature, precipitation, soil, etc. from different sources -- I'm amazed that the real maps looked so good. Actually, both maps of a pair looked good. It was a fun test!


Score: 8

Overall patterns; blockiness; too much sprinkling of colors.


Score: 4

The algorithm used to generate the simulated maps seems to be very sensitive to the ``constraints'' of the real maps. If this is true, can the simulated maps really be considered null maps? Or rather, are they simply a copy of the original map with some stochastic variation thrown in?

Nevertheless, good job. I enjoyed the challenge. I would like to 'challenge' one of my colleagues here, but he is color-blind and would not fare well with the earth tones.


Score: 10

Very clever test...I got 50%, hmmmmm....

I tried to look for drainage features and then see how the patterning developed away from them. I also looked for features that I thought might reflect human use.

One thing that I thought was deceiving was that certain features, those displayed in white, were virtually the same on both maps. I had assumed at the outset that the ``fake'' map was generated in its entirety by the computer, but in fact there were common elements in both maps. Perhaps I missed something in your introductory discussion.


Score: 4

Drainage and terrain related patterns


Score: 8

It was difficult to see all the colors (9) in the images, especially the yellows. I have very little experience with fractals and was pretty clueless here! I based my decisions on pixel distribution.


Score: 5

Distinguish how? Realism (?). I kept thinking that neither one was 'real'. Nice app, but ditch the blinking 'real'; try a red font. Can you provide some background music? Jeopardy? Compare the accuracy of the first 7 vs. the last 7.


Score: 13

I tried to observe edge characteristics and look for realistic patterns.


Score: 9

I *thought* the fractal-generated landscapes exhibited greater contagion and more of a gradient across the map. Clearly this was not the case since my success at identifying the real landscapes was <50%! How were the fractal landscapes generated with respect to the real landscapes? Were the fractal simulations based on the real landscape maps or generated with a purely fractal algorithm?


Score: 10

I observed that the difference between maps was based on clumping of similarly colored features, with some features being preserved between pairs. I tended to choose those maps where features were more dispersed and where dendritic patterns remained consistent. The map you used was an ecoregion map, which is confined in its viewing interest to environmental scientists and ecologists. Ecoregions are hardly representative of a real physical landscape, which is visually constrained by geomorphology, so I think this test was somewhat biased by the familiarity of the landscape to a small specialist group. Secondly, I think it would have been nice to see which guesses were correct and vice versa. Overall, though, I liked your presentation of the test, though I think the base map for comparison was the wrong choice.


Score: 11

Generally I looked for features which resemble streams; not knowing what the classes were, of course, this was just a guess.


Score: 12

(1) Sharp boundaries reflect the real landscape.

(2) More self-similar patterns reflect the simulated landscape.


Score: 15

Topographic patterns. Rivers, ridge tops, etc.

Since the mouse is on the right, when I could not make up my mind I felt like clicking on the first available map. It would be interesting to see if there is a bias.


Score: 11

I tried to distinguish landscapes based on surface/near-surface moisture patterns such as drainage, soil moisture, and organic matter patterns. I noticed that I could try to distinguish some landscapes by: 1) the pattern of association between similar colors and what I perceived as surface moisture patterns; 2) the assumption that the Realizer maps would have slightly less clustering of categories, because the Realizer may select places with equal likelihood to have the category, but this may not be ``naturally'' consistent with associations as I perceive they should be. Even so -- I only got 55% right, so you guys did an excellent job! Now, if I knew the scale beforehand... how would the test turn out? How about if I knew the categories with a legend? If I did no better at distinguishing landscapes with this knowledge, then the Realizer would have passed a more rigorous test, I believe.


Score: 10

Looked for underlying topography and gradients


Score: 10

Associated broken dendritic, ``topo shadow'' and linear features with real landscapes. Apparently a coin would have done me as much good!

Neat! When might this be available for public use?

Are you planning to see if the conventional landscape indices can distinguish real from Realizer?


Score: 8

Intuitively. I looked at the medium to large scales and tried to decide whether the overall relationships were ``right''.


Score: 16

I think the simulated landscapes had too much ``salt and pepper''. Real landscapes usually have high autocorrelation over short scales, leading to compact clusters.
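This short-range autocorrelation intuition is easy to quantify. The sketch below is a minimal illustration (assuming numpy; the maps are hypothetical stand-ins, not data from the test) of a join-count statistic: the fraction of adjacent pixel pairs sharing the same class, which scores speckled maps low and compact maps high.

    # Minimal sketch of a like-adjacency (join-count) statistic for a
    # categorical class map: the fraction of rook-adjacent pixel pairs that
    # share the same class. Compact, autocorrelated maps score near 1;
    # ``salt and pepper'' maps score much lower. Maps here are hypothetical.
    import numpy as np

    def like_adjacency(classmap):
        horiz = classmap[:, :-1] == classmap[:, 1:]   # left-right neighbor pairs
        vert = classmap[:-1, :] == classmap[1:, :]    # up-down neighbor pairs
        return (horiz.sum() + vert.sum()) / (horiz.size + vert.size)

    rng = np.random.default_rng(0)
    speckled = rng.integers(0, 9, size=(128, 128))                  # 9-class noise
    blocky = np.repeat(np.repeat(speckled[::8, ::8], 8, 0), 8, 1)   # compact clusters
    print(like_adjacency(speckled))   # about 1/9 for 9 random classes
    print(like_adjacency(blocky))     # close to 1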


Score: 11

I tried to factor in the level of complexity, and whether or not some patterns seemed illogical. Also, it appeared as though the real map may have been getting loaded first, or faster than the fake map; you may want to check on that. Did you say FUN? This was positively demoralizing! I got to where I was practically flipping a coin. And clearly, I did no better than that!


Score: 11

I tried to distinguish the maps just with my familiarity with topographic, bathymetric and fractally generated maps and the sorts of patterns that you may see in the different types. I can see the need for this test so that you can be rigorous, but after looking at the first couple pairs I was quite convinced that the synthetic maps and the maps from real data were quite similar.


Score: 6

I looked for the speckling that is characteristic of most remotely sensed imagery. I also tried to compare the patterns that were common to both maps against the patterns shown in the other colors.


Score: 4

I thought the simulated landscapes would be too ``perfect'' and the real landscapes would have more of a ``salt-and-pepper'' effect as if a satellite image had been classified. Obviously I was wrong!


Score: 10

I tried to look for pattern on the landscape, and to see if the simulated pattern flowed as it would in an actual landscape.


Score: 12

Intuitive. Pattern and connectivity. Wild guess.


Score: 12

I looked for a 1-pixel ``freckling'' effect in the simulated landscapes, or for what I thought was a lot of fragmentation of a particular cluster type. Also, some of the clusters looked too square, or their edges too straight for too long a distance, but I don't know if that was right or not.


Score: 10

By the degree of self-similarity.


Score: 16

Bill, nice work! A couple of decision heuristics I used were lacunarity and anisotropy. I was suspicious when

(1) class shape was nearly isotropic;

(2) class was overly dispersed (a low lacunarity at large sampling extent; see the sketch after this comment);

(3) classes overlapped in abrupt ways; and

(4) when apparent cross-class anisotropies were evident in one but not the other (if they were evident, I chose that as the real map).

Finally, it would be nice to see where we get fooled, using thumbnails perhaps, and it would be interesting to look at whether there is a learning curve for correct hits.
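For the lacunarity heuristic in point (2), the gliding-box method is one standard way to compute it; the sketch below is a minimal illustration (my assumption about the computation, not code from the study). For box size r, slide an r-by-r window across a binary mask of one class, record the count of occupied cells (the ``mass'') at each position, and take E[m^2]/E[m]^2: an evenly dispersed class gives values near 1, while a gappy, clumped class gives larger values.

    # Gliding-box lacunarity for a single-class binary mask -- a minimal
    # sketch of the heuristic described above, not the study's own code.
    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    def lacunarity(mask, r):
        boxes = sliding_window_view(mask.astype(float), (r, r))
        masses = boxes.sum(axis=(-2, -1)).ravel()   # occupied cells per box
        mean = masses.mean()
        return (masses ** 2).mean() / mean ** 2     # E[m^2] / E[m]^2

    rng = np.random.default_rng(1)
    dispersed = rng.random((128, 128)) < 0.1    # ~10% cover, sprinkled everywhere
    clumped = np.zeros((128, 128), dtype=bool)
    clumped[20:60, 20:60] = True                # same cover, one compact clump
    print(lacunarity(dispersed, 16))   # near 1: low lacunarity (overly dispersed)
    print(lacunarity(clumped, 16))     # much larger: clumped at this extent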


Score: 13

I don't know how I did it! I was probably avoiding what I thought was excessive fragmentation, and I was looking at the boundaries between classes to guess if they looked ecologically consistent.

Given no information except pattern, I am wondering exactly what the test is testing in terms of landscape ecology. The term ``real'' implies that a pattern or set of patterns can be distinguished as ``unreal'' solely on the basis of pattern. I think this is not likely. It might be instructive to identify the characteristics of maps that people think are clearly ``unreal''.

The realizer certainly generates great landscape patterns. Is this telling you that landscapes really do have a significant fractal component? I think we knew that, and maybe you are showing that computer generated fractal landscapes are hard to tell from real fractal landscapes.


Score: 8

If after each selection I was told which was the real landscape, then maybe my ability to find the real landscape would improve.


Score: 14

At first I tried to find fine details on the maps and to figure out what the patterns might represent: streams, etc. Then I became frustrated and looked at the maps holistically. I have looked at a lot of aerial photos, so I tried to find the one that most resembled a real scene from an airplane.


Score: 12

Imagined topography -- looked for continuity of drainages, ridge orientations, etc. If the pattern did not look like it was primarily driven by topography, I did not choose it.


Score: 4

I thought I might be choosing the simulated landscapes rather than the ``real'' landscapes based on the degree of fragmentation, etc. The graininess of the two comparisons was usually different enough that I could tell one was computer generated. In a very real sense, however, neither landscape is real. What you call ``real'' maps are, in fact, simplifications of real landscapes. Thus, I think this is not a fair comparison to the true Turing test. It is as if Turing were to test two computers for their degree of human intelligence, rather than a computer vs. a true human.

So, is 4/20 significant? I'll have to compute the binomial, but I think so.
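The binomial in question is quick to compute: under a fair-coin null (p = 0.5 per pair), the chance of getting 4 or fewer correct out of 20 is the lower tail of a Binomial(20, 0.5). A minimal sketch, assuming scipy is available:

    # One-tailed binomial check of 4 correct out of 20 against chance (p = 0.5).
    from scipy.stats import binom

    p_low = binom.cdf(4, 20, 0.5)   # P(X <= 4) for X ~ Binomial(20, 0.5)
    print(p_low)                    # about 0.006 -- significantly below chance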


Score: 15

About halfway through the test I realized that I had a strategy for selecting what I thought was the correct map. In a couple of cases I felt the choice was obvious; in the others, my gut feeling was always to choose the map with the least complicated patterning. Some of the map patterns seemed unrealistically complex to me. Overall, I feel you did a wonderful job simulating pattern!


Score: 10

Basically I selected one at random. Actually, the result already shows that: 50%.

Before I know more about the ``driving engine'' of the Realizer, it's difficult for me to say anything at this point. I assume you will be at the landscape ecology symposium. We can talk about it there.


Score: 12

I oscillated among choosing based on what appeared to be structural/substrate effects, erosional patterns, and altitude/drainage differences, all of which may produce different patterns. By frame 4 I had concluded that the patterns were somewhere in the northeastern US. They were not Southwestern or Northwestern -- for what that's worth as a right-brain conclusion. Doing good!


Score: 5

Patterns, thinking about riparian networks. Definitely had vegetation in mind. Fun test!


Score: 15

I distinguished mostly on juxtaposition of types. Randomization often leads to many more adjacent habitat types than in the natural landscape (because you're not modeling the underlying processes such as erosion, etc.). Also, synthetic fractals are typically rougher at the finest scales than patches derived from data/imagery, because imaging and data collection tend to smooth very fine details.


Score: 11

I'm not convinced that this is a fair assessment. Unless you know what kind of map you are looking at, you can't really judge the patterns. This map was apparently a composite map, not simply temperature, or elevation, or soil type. Composite maps tend to look non-intuitive to begin with. For example, one of the things I looked for was smooth transitions between contours. This immediately led to two problems: (1) you don't give the color progression, so I couldn't tell if two adjacent colors were gradational or a jump; (2) contouring mixed quantities can lead to legitimate jumps. If I'd known I was looking at a contour map of a mixed quantity, I'd have selected differently. If you really want to test your simulator's ability, try picking a single value, like elevation or temperature; tell the tester what is being contoured and what the color scale is; and then run the test. This test is like trying to distinguish a human voice from a computer voice, but with 500 voices talking at once in both cases.


Score: 8

Based on working with rescaled (aggregated) maps of a variety of features, I basically tried to look for some of the ``ugliness'' (blocky, ``unrealistic!'') patterns that often come from working with real maps at a coarser scale than that of a ``pretty'', smooth, fine-scale picture of the features. Guess that didn't work!!


Score: 16

Tried to see the ``fractal looking'' patterns at the edge of each color. Considering I am not a map observation expert, it seems to me it was pretty obvious in most cases. I don't think I was lucky :-)


Score: 12

Degree of small scale complexity


Score: 17

Very interesting test. Knowing the type of map used would be a real key to estimating which one was real (e.g., man-made soil map versus hydrologic map versus aerial photo versus contour map versus country boundary map, etc.) The user is left to guess in this test, which makes it more challenging. Also, someone who had seen fractal displays may have an advantage over someone who has not. In general, the fractal patterned maps seem to be more complex and less homogeneous with little bits and pieces left around, and perhaps don't follow the natural lineations quite as well.


Score: 12

Mostly I keyed on what seemed to be erosional features associated with bodies of water. Occasionally I rejected frames that had an excessive number of isolated single squares.

I noticed that many of the pairs had duplicated complex patterns, clearly non-random (as though some shapes had been reproduced on the generated figure from a template). I was not expecting this.

Good luck!


Score: 6

Without knowing the scale of the ``landscape'' maps (they actually covered much more area than I would consider to be landscape scale), it's impossible to settle on a frame of reference from which to make comparisons. A ``real''-looking landscape at one scale may look totally bogus at another. I also did not know how the images related to each other: sometimes there were patterns that matched exactly across both maps, other times they looked completely different. I'm kind of lost as to what you gained from this, but you may be sitting back chuckling at my comments and thinking, ``boy, we sure fooled this guy!''


Score: 10

In my email to you just prior to taking this test, I predicted a 50% success rate (which was obtained), on the basis that unless at least one of the observers is human, the Turing test does not apply.


Score: 11

Basically I guessed -- I couldn't tell the difference. I tried to guess at what may have been generated by fractals.


Score: 10

I tried to establish which colors represented higher and lower elevations. To make the simulation more realistic, you should probably include a key which provides a relative ranking of elevation with color. Without this key, there is not much to go on.


Score: 10

I think it was a rather unrealistic test, since there was inherent knowledge of the expected difference between the simulated and real maps. A better test might be to present 20 pairs of maps simultaneously and have subjects try to group them into two groups; that way the null hypothesis would be whether the landscapes are perceptibly different.


Score: 12

Looked for linear patterns. Also, looked at patterns adjacent to what looked like water features (dendritic drainage).


Score: 14

Initially I looked for polygons that appeared to be water courses and checked to see if they behaved as water courses do. After a while I lost confidence in my ability to actually do this. On some maps one of the pair had an unusually large number of individual pixels of a land type, and I assumed that these were artifacts of the simulated map. In others, however, I selected as real the map having more detail. In all, I really did not feel confident in my ability to detect the real map... I was in a guessing mode most of the time. I'm actually surprised that I got a score >50%!


Score: 13

Hi Bill:

It seemed to me that each pair shared some EXACTLY identical patterns and each pair had the same compositions. There were differences in terms of spatial arrangements between the real and hypothetical maps.

I would be interested in learning the detailed mechanisms of generating the hypothetical maps. For example, did you use some parts of the real map when you produced a hypothetical map?

The program could be of great significance. I wish I could have done the test earlier. Thanks.


Score: 9

Some patterns seemed too ``perfect,'' but these may have been the real ones for all I know. Most sets had some elements that seemed real on both images and some elements that seemed too ``perfect.'' I usually resolved the conflict by looking for associations. I'm sure I made some assumptions about the gradient from stream courses to mountains. This was based on thin-line branching patterns, assumed to be streams, grading into whatever seemed opposite and distant. Note that my experience is strong with natural, real landscapes, poor with simulated landscapes.


Score: 10

I predicted I would get 50% correct since it was essentially a random process. I had no idea what I was looking at and had no basis for making my choices. Real landscapes can occur in all kinds of patterns -- there is no pattern that I would say is impossible.


Score: 10

I looked for dendritic patterns that would indicate waterways and then for strongly contrasting patterns on either side. I reasoned that the veg. should not be terribly dissimilar since the elevation, water table, etc. should be similar. This reasoning may break down in areas of steep slope, however, where there could be significant differences between N and S slopes leading down to a stream, for example. I also looked for strong fractal patterns such as you would see in a Mandelbrot book and assumed those were the Realizer rather than the real map. These lines of reasoning may be way off base, though. It was hard to make the decisions.


Score: 13

I looked for patterns that seemed ``out of place''. If these patterns were very geometric - square or rectangular - I assumed they indicated actual features. Other patterns did not seem to mesh well with other categories; I took those to indicate fractal-generated features. Very convincing, though!


Score: 10

I couldn't help but look for geologic trends, since that is my background, but I always found themes in each map that could have belonged to such a geologic map.


Score: 16

My bias tended to be towards eliminating the `salt and pepper' pixels. I tended to go for the smoother boundaries, and I carefully assessed what I thought looked like stream networks. Nice job on this! I can bet it will be very helpful to many researchers!


Score: 13

Relative shapes of the different categories with respect to one another. The easiest to judge (I think) were the ones with lots of graininess in what I assume were the fractal renditions.


Score: 14

I looked for overall connectivity, linear features and realistic gradients.


Score: 13

Repeating patterns of association between selected colors; network topology; consistency of features across the image.


Score: 15

At first I attempted to distinguish ``reality'' as something with greater detail. Later, I selected based on smooth curves; e.g., several maps seemed to have broad curvilinear structures... the appearance of ridge/valley terrain in the eastern U.S. In many cases, it was little more than a coin toss. In several cases, the two maps were obviously different, but I really could not recognize which was real.


Score: 16

I didn't really think it was a useful test, because I had no idea what I was supposed to be looking at, or where. Geology in Tennessee? Vegetation in northern Ontario?


Score: 7

Complexity of landscape. Size and association of different cover types.


Score: 11

What was confusing to me was the fact that some patterns were held constant in both maps; therefore, it was not a completely true test of how well you could simulate all patterns. Often the class held constant was one of the dendritic patterns, which I would think would perhaps be the hardest to truly simulate. It would be interesting to see what people score if you tell them what the cover types are -- because in reality that is what one would know -- and if you have all class types simulated. Also, I would think the ability to identify correctly would increase with fewer classes. Might be interesting to test. I will look forward to your results.


Score: 12

When the map fragments contain no landscape meta-information, it's almost impossible to distinguish them. However, when they do, then at this stage of the Realizer it can be a dead giveaway. I suggest this might also be a function of modern classification techniques and the like, which can generate rather information-poor maps. An ideal Fractal Realizer should generate good maps that convey a sense of landscape at all levels. Good job! That was fun.


For additional information contact:

William W. Hargrove
Oak Ridge National Laboratory
Environmental Sciences Division
P.O. Box 2008, M.S. 6407
Oak Ridge, TN 37831-6407
(865) 241-2748
hnw@geobabble.org
