Have you ever heard of the Voynich manuscript? It’s a mysterious book, possibly created in the early 15th century, that contains weird illustrations and text written in an unknown script and language. Since its rediscovery in 1912 by Wilfrid Voynich, it has eluded the decipherment attempts of generations of cryptographers. The Voynich manuscript is a fascinating piece of history that has inspired many novels, games and films. Amateur cryptographers can find the latest news and research on the Voynich manuscript and other uncracked ciphers on Nick Pelling’s blog Cipher Mysteries. He’s also the author of the readable non-fiction book The Curse of the Voynich.
To celebrate the publication of the 500th Sandra and Woo strip, I have decided to publish “my own Voynich manuscript”. So here it is, The Book of Woo! As you can see, it resembles the Voynich manuscript in several ways. But of course we couldn’t create 240 pages, 4 had to be enough. Unlike the Voynich manuscript, The Book of Woo definitely contains sensible information that can be deciphered. I guarantee it ;-). And I will pay the person who is able to provide a decipherment that’s sufficiently close to the plain text a reward of $500. Send your decipherment attempt(s) to novil@gmx.de. I would also love to hear about your general ideas or statistical analyses that you carried out. There is no deadline. I will not publish the solution until at least strip #1000.
But be warned: It’s a huge challenge and I don’t expect to receive a valid decipherment at all. It’s primarily a work of art, not a puzzle for the general public. I believe that only experienced and dedicated code breakers have the chance to succeed. A lot of time was spent on the encryption. If you think you can simply carry out a frequency analysis on the letters and be able to reconstruct the English or German plain text this way, well, that’s just a waste of time. However, to make things a little easier, I want to give you the following hints:
- The encryption isn’t based on an algorithm only suitable for computers which executes a loop 100 times or something like that.
- The encryption isn’t based on some sort of device or mechanism that is hard to get.
- No “classical” steganographic method was used since that would just be impossibly hard to crack.
- The plain text is some sort of literature, as one can guess from Woo’s comment and the illustrations. A lot of time went into the plain text as well, it’s not just a copy of the first page of Rascal or something like that.
You can download larger versions of the four pages of the Book of Woo here:
[Update: 10 August 2013] Everybody who is seriously interested in deciphering The Book of Woo should read the comment section. There is a lot of interesting information in it.
[Update: 31 March 2015] The Book of Woo Wiki, now maintained by our reader Chris, also contains valuable information for anyone who’s trying to break the code. In case the wiki should go offline sometime in the future, I created a complete backup of the wiki’s content on 31 March 2015.
In other news, the winners of the Sandra and Woo and Gaia fanart contest 2013 have been posted.
Thanks to everyone who participated!
- Sandra: Hey, Woo, what are you writing?
- Woo: Oh, just a little story.
- Sandra: Really? Can I have a look?
- Woo: Sure, I’ve just finished it.
- Sandra: What in Voynich’s name…?!
another thing i noticed that would support the name theory for the single sign ryan named & … whenever it appears it is way larger than any other sign…it sticks out
that is probably meant to give it more importance/significance
even if it is not a name it most likely is a single concept thats pretty important
I hardly ever comment on anything on the internet… But this is just exceptionally pretty. If someone deciphers this or not. It is a work of art. <3
Novil, you are evil.
Also, very rich. Prize money for artwork, prize money for cracking the code. Seriously, how much are you bringing in on this comic?
Okay, since there are big hints at the text being encoded, rather than being an actual language (existent or not) I am less interested in actually cracking the language, I will add however, that the images on the third and fifth pages seem to show something related to capture, and the release from capture. Also, maybe Aztec(?) mythology could be a useful place to look (Is there a turkey god and some form of fire-breathing bird?)
And on page 3, the right picture seems to depict a city of sorts, but with forest-like traits (It has branches, but also windows and electricity poles). So it can either signify the free world (where raccoons can live free), how raccoons see human cities, or a Raccoon city.
Now on to the first and second pages: The first page has two non-raccoon animals, intertwined, they seem like… weasels?(Long mammals, I am not too well versed in mammal names and types =P). They are surrounded by something which appears to be a halo of sorts (Judging by the golden colour and the spokes radiating out), ringed by flames, could signify the sun, or a creation from fire. From one of the flames spouts the tree, judging by the leaves, it is an Oak tree, assuming the leave size is symbolism, and not realism.
The raccoon depicted climbing the tree is almost guaranteed to be the Raccoon goddess, however note that she is not wearing the amulet she is in previous depictions. (Take note of the amulet though). The paws of the raccoon goddess have halos around them, which is logical, as raccoons have very sensitive paws, (As told earlier on this site) and the author has already shown to find the paws important to raccoons. Thus the Halos could either indicate her holiness, or alternatively, wards against the black, evil looking creatures which are a bird of prey, and a wolf, both of which would likely be predators of the Raccoon.
Thus I believe the first page might be the raccoon goddess warding off *evil*. It might be a creation myth.
Now on to page two. first off, the symbols have been noted before in some post, the cane, indicating the craftiness and stealth of the raccoon, the top-right symbol is a paw, probably of a raccoon, now take note the similarity with the amulet the Raccoon Goddess is wearing here: http://www.sandraandwoo.com/2013/07/04/0496-listen-to-your-stomach/
The bottom symbol could mean hearing, but it could also mean rays of the sun, since those are occasionally depicted as waves. In the picture dominating the page, the raccoon goddess can be seen interacting with normal raccoons, all of which have their eyes closed, and are kneeling, possibly indicating respect, and/or prayer. Again, there is a halo around the goddesses hand. In this case, it does not seem to depict so much that she is holy, but more that she is performing some supernatural feat. Perhaps she is gifting the raccoons with cunning(first symbol) sensitivity of touch(Top right symbol) and superior senses(bottom-most symbol).
That is all I can read out of the images, I hope it helps =D. If anyone has anything to add, or their own opinions about things, I would be glad to discuss them!
As someone who’s keen on making his own languages and loves decoding things, I just HAD to take a shot at this. At first it was pretty daunting until I realized a few quick hints from basic observation.
1. Marks of punctuation. There is a colon, periods separating sentences, quotation marks, and even exclamation points. This tells me the text was written in English or a similar Western language.
2. Hints given by Ryan (and just general suspicion) this is a trick of the letters. This simplifies things by A LOT. Languages and some fictional ones I study don’t even use alphabets.
3. So because this is an alphabetic language, I started with seeing how many different letters we have. It is likely and possible to be more complicated than that in regards to actual interpretation (a phonic sounding trick in there maybe as a secondary layer). This is probably going to take up far more hours of my time than I really want or need.
Marcus wrote:
Have you started on any languages yet? I personally am also interested in fictional languages and such (I have taken several jabs at translating the Dwemer language from the Elder Scrolls series and am currently working on my own language =P). If you have already started, is there anywhere where I could see it?
Woo-ha! Great piece of art and I just love Mysteries! 😀
@ Mac Johnson:
Ah, I was just about to mention that myself.
I wonder if it might be an Enigma-type cipher … though that might well be “too” complex, given the “rules” of this game. The cryptographers working on cracking that usually looked for repetitive sequences to start from…
The trap is to assume it’s one particular favourite word when it could still be a great many… Going along the (false?) letter-substitution lines, if we assume it’s “The” at the start of each of those lines (with some words then extended, eg into “There”), then look for those symbols elsewhere, we can quickly see that it’s a dead end idea because there’s plenty of words which would be, e.g. ?h?h or similar, which isn’t common in English and I doubt common in German (“Shah” is about all I can think of). But other common sentence-starters like “She”, “Her”, “Its”, “And” also suffer similar issues.
My current pet theory is that it’s a phonetic language, possibly even syllabic, though that may be less common. If the Raccoons have their own script, there’s no good reason for it to carry all the many thousands of years of corrupting baggage that something like English has (where many years of schooling are required to be able to read and write it “properly” thanks to how many departures it takes from both the spoken language, and logic in general). I’ve seen e.g. sci fi comics which use a phonetic “reformed” alphabet which combines somewhat altered spellings of familiar words plus a rather funky (but still mostly “roman”) font to great effect, providing a nice dissonant sense of e.g. billboards and direction signs etc not being immediately comprehensible to the reader, but becoming more so with a bit of study and thought.
Also, I’m leaning towards it being German in the end. Both the English and German comics have the same symboltext, even though the strips above them are translated. It would be easier for Oliver to come up with it in his native language (most of the strips themselves are conceived that way then translated, IIRC). And the strange word length frequency peaks would also suggest that, just with the actual peak figures differing because of e.g. a phonetic transcription.
To that end I suggest we break out the universal translators and go hook up with our cohorts in the opposite-world of the german discussion thread… they might be halfway to solving it already, who knows. Pretty certain us lot over here haven’t a snowball in hell’s chance without at least a rudimentary working knowledge of the language plus a paper translation dictionary to hand which we can leaf through for ideas and make notes in.
Without having a look at my e-mail inbox if somebody already broke the code, I want to announce the following changes:
The successful code breaker will only receive $250 since several readers have already published useful information such as statistical analyses. If the code is broken, I will also pay $100 to two charities determined by these readers. I will also spend more than $150 for a cool idea that will make several people quite happy. This is not related to breaking the code.
Is Woo related to Sly Cooper? That’s all I could decipher from this.
PS, don’t touch them, they’re made out of antimatter.
PPS, this quote just appeared across the top of the site after I posted that, I wonder if it’s a clue?
““Ssssao fem Iha wl aiv bl olwv rised.” – Antonio Averlino [1465]”
PPPS…
midivilplanet wrote:
…Sorry dude, that’s just not how binary data-word widths work. You only need FIVE bits to encode 32 different characters (or 31 + a space); “32 bit” encoding would give you more than FOUR BILLION different characters… even Unicode usually stops at 16 bits wide, and that’s enough (with a few code-page switching symbols) to represent every symbol in every human language on earth.
However, that said, you MIGHT be on to something there. It could well be a variant method of representing, say, the bitstream of a Baudot-Code based Telex or similar, if there are absolutely no more than 32 unique things that occupy each character space. Take the symbol, work out what 5-bit code it represents, then see if that can be made to work in the Baudot realm (which usually has at least 62 different actual outputs, with one or more codes reserved for codepage-switching). Or maybe it can even end up as a bunch of 5-bit symbols that are formed from breaking up 6-bit EBCDIC or 7-bit ASCII into a continuous bitstream then chopping it up a different way. That is, technically speaking, something you can do by hand, even though it’s a huge slog to get through.
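For anyone who wants to play with the bit-width argument, here is a minimal Python sketch. The names `bits_needed` and `regroup_bits` are made up for illustration, and the 7-bit ASCII input is only one of the possibilities mentioned above; nothing here is claimed to be how The Book of Woo was actually encoded.

```python
import math

def bits_needed(n_symbols):
    """Smallest number of bits that can distinguish n_symbols values."""
    return math.ceil(math.log2(n_symbols))

def regroup_bits(text, in_width=7, out_width=5):
    """Write each character as an in_width-bit code, join the bits into one
    stream, then cut the stream into out_width-bit values (zero-padded)."""
    bits = "".join(format(ord(ch), f"0{in_width}b") for ch in text)
    bits += "0" * (-len(bits) % out_width)  # pad the tail to a full group
    return [int(bits[i:i + out_width], 2) for i in range(0, len(bits), out_width)]

print(bits_needed(32))       # bits for a 32-symbol alphabet
print(2 ** 32)               # how many values 32 bits can hold
print(regroup_bits("Woo"))   # three 7-bit codes recut into 5-bit symbols
```

Every value returned by `regroup_bits` with `out_width=5` lies between 0 and 31, so it could be written with a 32-symbol script. The sketch assumes plain ASCII input; characters above code point 127 would need a wider `in_width`.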
PPPPS (the last one I hope)
Note a change to the T&Cs / prize structure that’s happened in between me writing my two posts, even. Maybe we (or the Germans?) are racing towards the answer faster than expected, so it would seem rash to give away such a sum on a whim… instead some of it goes to charidee (mate) and some of it’s also being reserved for some other interesting thing. Fair enough, really…
@ tahrey:
Ah, Oli just announced the change anyway.
PPPPP(?)S: another quote appeared in my browser —
“Psidms sid Vjj cp ktmr ojgd.” – Thomas Jefferson Beale [1820]
…worth looking up these people and collections of their famous quotes to see if it could give us an “in” to the Raccode?
OK, I think I saw Seeohmhtlahmakasay written in the text. Third page, eighth row, first word, there seems to be a single 18 character word that could be the name of the goddess. Maybe the letters depend on where in the word the character is.
The problem is that my complete lack of German (aside from Bitte or other simple words) makes it harder to see if the letters assigned to the positions I’ve given them can easily form some words.
I’m no expert decoder, but I’m interested in crypto, so I’ll still give it a shot.
@ kurokotetsu: Sorry, second page.
P^6S
The first two words could, depending on your exact reading of the raccoon god’s name, both work out to syllabic-character renderings of “See Oh Ta La Mas Kay”… (with the difference being in how you divide up “Maskay”, as it’s a more complicated set of sounds than the others, which are very nearly phonemes rather than syllables…)
One last one as I have something else to go and do
“Andrsn nds Chh fd smosud wsntdkdfah.” – Sámuel Literáti Nemes [1833]
(P^7S: If you’re making up a -practical- language that’s supposed to be easily understood by its readers, you DON’T incorporate a symbology where the gross phonetic “meaning” of a particular character jumps around depending on where in the word it is… at least, not much, and not on a randomly changing basis. Yes, English and some others like Welsh do this, but if you were making up a language from scratch you’d leave that kind of behaviour in a dumpster at the side of the road. Even so, their changes and mutations happen in known, predictable fashions. Depending where you sit along the “invented script” to “complicated cipher” line, you might want to bear such in mind 😉
OK, I lied. But seriously, as I don’t have anything else to write below this (and thus come back to check), this IS the last for now.
“Kaagia aag Scc as gr uie srn.” – H. C. Reynolds [1948]
I thought I might post a list of words, names and phrases that I believe have a high chance to be part of the plain text. I hope that it will help the decrypting process.
-God, Goddess, Deity, Seeoahtlahmakaskay
-Raccoon, Masked ones
-Yggdrasil, The World tree, Tree of Life, Ash tree
-The Unnamed eagle, Vedfolnir, Hawk, Storm pale, Wind-witherer
-Wolf, Fenrir, Geri, Freki, The Ravenous, Greedy one
-Ratatoskr, Squirrel, Messenger, Drill-tooth, Bore-tooth
-Children, Offspring, Kit, Disciples
-Power, Gift, Blessing
@ tahrey:
I doubt this is a syllabic or phonetic cypher. I entertained the idea, but it seems no good with the number of symbols. I’m counting over 40 different phonemes in English (assuming that it is the language in the plaintext, but I think German has more or less the same amount) and the possible combinations of that (to make a syllabic alphabet like Hiragana and Katakana in Japanese) are far too big to be coded in only 32 characters (even considering that there are some restricted combinations).
@ Ryan:
I actually suspect that the stand-alone symbol (&) may represent the raccoon Goddess. As she seems to be important enough to the story for that sort of thing.
@ Mr_Nabby:
My first thought when I saw “&” was that it probably represents a name. My guess is the raccoon goddess, as she’s definitely important enough for that sort of thing.
On another note: I like the theory that the 30 other symbols are probably the 26 letters plus ä, ö, ü, and ß. If this is the case, it would also support the code being, at least somewhat, German based.
It is very interesting to hear all your theories. I have not yet received a decipherment attempt, by the way.
Interesting, this set of characters looks like a mix of characters from two such sets i made myself.
would almost be curious to give decoding it a try.. if i could spare the time for it…
Still like Randall Munroe’s take on the Voynich Manuscript:
http://xkcd.com/593/
Novil said that content exists inside the encryption, I didn’t notice any guarantee that it is intelligent content.
I imagine that Randall could encode a bit.ly to an ASCII txt animation of Rick Astley’s Never Gonna Give You Up video.
I have no experience whatsoever with encryption (though I did invent a few sign languages and simple codes as a kid) but I would LOVE to really get into this.
Sadly, it’s examinations weeks for me, won’t be able to get into this till the mid of August…
However, if anyone needs help with the German language, I’ll be more than glad to team up (native speaker here), I’m not sure I’d stand a chance on my own anyway.
One bit of information that might be interesting to know: If the writing’s a bit repetitive, as in strings of letters reappearing in different words and so on, that may be a hint that it’s German or bits of German.
German’s a hell of a repetitive language, using a whole lot of word stems and syllables and inflections to create a vast range of different words with different meanings.
(e.g. absagen (to decline), entsagen (to abdicate), versagen (to fail), vergessen (to forget), vermessen (to measure), verfressen (greedy – regarding food)… the list goes on. Endlessly so… I occasionally have some fun sifting through the German language for this kind of words)
I decided to give the following hints so that nobody follows a completely futile path:
“Fnaqen naq Jbb” is just rot13 of “Sandra and Woo”. It’s followed by two anagrams for Oliver Knoerzer and Puri Andini.
The four lines at the top of the website:
Ssssao fem Iha wl aiv bl olwv rised. – Antonio Averlino [1465]
Kaagia aag Scc as gr uie srn. – H. C. Reynolds [1948]
Psidms sid Vjj cp ktmr ojgd. – Thomas Jefferson Beale [1820]
Andrsn nds Chh fd smosud wsntdkdfah. – Sámuel Literáti Nemes [1833]
are also unrelated to “The Book of Woo”. They’re little cipher challenges of their own, using standard procedures. One of them was already solved by our reader Roachester on 15 September 2012:
“Psidms sid Vjj cp ktmr ojgd.” –> “Sandra and Woo is pure gold.”
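The solved pair above can be checked mechanically. The following is a minimal Python sketch, assuming only that the quote was produced by a simple one-to-one letter substitution; the helper name `derive_mapping` is hypothetical, and the sketch makes no claim about the other three quotes or about The Book of Woo itself.

```python
def derive_mapping(cipher, plain):
    """Build a cipher->plain letter map from a known pair, raising an error
    if the pair is not consistent with a one-to-one substitution."""
    if len(cipher) != len(plain):
        raise ValueError("texts must have the same length")
    forward, backward = {}, {}
    for c, p in zip(cipher.lower(), plain.lower()):
        if not c.isalpha():
            if c != p:
                raise ValueError("spaces or punctuation differ")
            continue
        # setdefault stores the pair on first sight and returns the stored
        # value afterwards, so any later mismatch is detected here.
        if forward.setdefault(c, p) != p or backward.setdefault(p, c) != c:
            raise ValueError(f"inconsistent pair {c!r} -> {p!r}")
    return forward

mapping = derive_mapping("Psidms sid Vjj cp ktmr ojgd.",
                         "Sandra and Woo is pure gold.")
for c, p in sorted(mapping.items()):
    print(c, "->", p)
```

The pair passes the check, which is what one would expect from a "standard procedure" such as a monoalphabetic substitution. Only the letters that occur in this one sentence are recovered.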
Gods all damn it, I don’t know German.
I’ve seen this story before. I’m trying to place it. It’s not norse mythology, it’s an old german tale, but for the life of me I can’t remember its name.
OK two “wild” suppositions about this crypt I think I’ll start working on.. First, it is a symmetric encryption (I doubt Novil is that sadistic) and there are at least two different cyphers applied to the text.
I’ll see if that leads somewhere.
it’s hard to tell if this is a red herring or not, but these norse runes http://en.wikipedia.org/wiki/Runes seem to closely resemble the characters in this alphabet. http://www.ballwithhat.com/wp-content/uploads/2013/07/abc.jpg
this is going to bug me forever if i dont figure it out, or at least learn the answer. now i’m gonna try it.
Using Ryans alphabet and transcription:
Even if it is not a simple letter replacement, I think the frequency analysis is interesting:
= n j d m p r w z v a y e u x
152 146 143 141 105 105 105 105 99 96 76 74 62 62 61
s l i > $ o # g h c b f q t k &
59 56 54 53 45 42 38 37 37 34 26 25 23 13 11 8
(I hope this gets displayed properly, otherwise copy it to a spreadsheet)
We have 4 very frequent letters of similar frequency (7.3-6.7%), then a big step, and then a slow drop for the remaining letters. The small step between v and a is not significant. This is different from English/German, where e is the most frequent letter (17%/13%, followed by n/t with 10%/9%).
The entropy content is 4.7 bits per letter, above those for English and German (~4.1). So far, some word-by-word algorithm to replace letters could produce this. As we have repeating words, I think that the translation of a word depends on that word only.
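The entropy figure can be rechecked from the posted counts. A minimal Python sketch, assuming the counts in the table above are accurate; the function name `shannon_entropy` is just an illustrative choice.

```python
import math

# Symbol counts as posted in the comment above (31 symbols).
counts = {
    "=": 152, "n": 146, "j": 143, "d": 141, "m": 105, "p": 105, "r": 105,
    "w": 105, "z": 99, "v": 96, "a": 76, "y": 74, "e": 62, "u": 62, "x": 61,
    "s": 59, "l": 56, "i": 54, ">": 53, "$": 45, "o": 42, "#": 38, "g": 37,
    "h": 37, "c": 34, "b": 26, "f": 25, "q": 23, "t": 13, "k": 11, "&": 8,
}

def shannon_entropy(counts):
    """Single-symbol Shannon entropy in bits, from a symbol->count dict."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

total = sum(counts.values())
entropy = shannon_entropy(counts)
max_entropy = math.log2(len(counts))  # value for a perfectly flat distribution

print("symbols counted:", total)
print("entropy (bits/symbol):", round(entropy, 2))
print("maximum for 31 symbols:", round(max_entropy, 2))
```

The result lands close to the 4.7 bits quoted above, and below the maximum possible for 31 equally likely symbols, so the distribution is flatter than ordinary English or German text but not completely flat.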
There is some structure in two-letter correlations. As an example, # is always followed by n, u, v or = (with one #b as exception). Certainly not by chance, as we have 38 of them. It would be interesting to do a full analysis of this. Do we get a structure similar to English (sh, th, ph,…) or German (ch, ck, sch, ph, …)?
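A full two-letter analysis like the one suggested here is easy to script. This is a minimal Python sketch; the function name `follower_counts` and the sample words are invented for illustration and are not taken from the real transcription.

```python
from collections import Counter, defaultdict

def follower_counts(words):
    """Map each symbol to a Counter of the symbols that directly follow it
    inside a word (pairs are not counted across word boundaries)."""
    followers = defaultdict(Counter)
    for word in words:
        for left, right in zip(word, word[1:]):
            followers[left][right] += 1
    return followers

# Toy input, NOT the real transcription.
sample = "#n=d #ujw m#v= #=".split()
table = follower_counts(sample)
print(table["#"].most_common())
```

Run on the real transcription (split into words), a symbol whose followers come from a very small set would stand out immediately, which is the kind of structure described above for #.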
L and I seem to be typical letters at the beginning of words. 40 words begin with I and 37 words begin with L, with a total of just 54 I and 56 L in the text. B,W,Z do not begin words, and the entropy for first letters only is 4.4 bits/letter. That looks similar to English and German.
The easiest way I see to produce a cypher text with those features: Replace letters with numbers, take the first value as first symbol, then the sum/difference/… of the first two values (mod 30?) as second symbol and so on.
It does not give the 7/9-lettered words, but I guess that can be explained if we combine common word groups.
There are more options, of course, but those are harder to find.
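One possible reading of the sum idea above, as a minimal Python sketch. The assumptions are loud ones: a 26-letter alphabet numbered from 0 (the comment guesses "mod 30?"), the sum variant rather than the difference, and word-by-word encoding. The names `encode_word` and `decode_word` are hypothetical, and this is not claimed to be the actual Book of Woo cipher.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: 26 letters, a=0 ... z=25

def encode_word(word, alphabet=ALPHABET):
    """First symbol is the first letter; each later symbol is the sum of the
    current and previous plaintext letter values, modulo the alphabet size."""
    values = [alphabet.index(ch) for ch in word]
    m = len(alphabet)
    out = values[:1] + [(a + b) % m for a, b in zip(values, values[1:])]
    return "".join(alphabet[v] for v in out)

def decode_word(cipher, alphabet=ALPHABET):
    """Undo encode_word by subtracting the previously recovered letter."""
    m = len(alphabet)
    plain = []
    for i, ch in enumerate(cipher):
        c = alphabet.index(ch)
        plain.append(c if i == 0 else (c - plain[-1]) % m)
    return "".join(alphabet[v] for v in plain)

for w in ["the", "there", "raccoon"]:
    print(w, "->", encode_word(w), "->", decode_word(encode_word(w)))
```

Because each word is encoded on its own, repeated plaintext words give repeated ciphertext words, and words sharing a prefix ("the", "there") share a ciphertext prefix, while single-letter frequencies get smeared out.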
I agree that & is probably a name, or a symbol of special significance, and not a regular letter.
I did not see anything in the German comments not mentioned here.
[…] Sandra And Woo has just taken a detour into CryptoLand, with a Voynich-inspired page called The Book of Woo to celebrate its 500th edition. What’s more, author Oliver Knöerzer (AKA “Kernel […]
based on the following (without quotes, space, comma…) modification of the posted transcription (thx a lot to Ryan) i get the following statistics
1: 38 = 1,82 %
2: 53 = 2,53 %
3: 152 = 7,26 %
4: 8 = 0,38 %
5: 45 = 2,15 %
a: 76 = 3,63 %
b: 26 = 1,24 %
c: 34 = 1,62 %
d: 141 = 6,74 %
e: 62 = 2,96 %
f: 25 = 1,19 %
g: 37 = 1,77 %
h: 37 = 1,77 %
i: 54 = 2,58 %
j: 143 = 6,83 %
k: 11 = 0,53 %
l: 56 = 2,68 %
m: 105 = 5,02 %
n: 146 = 6,98 %
o: 42 = 2,01 %
p: 105 = 5,02 %
q: 23 = 1,1 %
r: 105 = 5,02 %
s: 59 = 2,82 %
t: 13 = 0,62 %
u: 62 = 2,96 %
v: 96 = 4,59 %
w: 105 = 5,02 %
x: 61 = 2,91 %
y: 74 = 3,54 %
z: 99 = 4,73 %
perhaps the most frequent letters (http://en.wikipedia.org/wiki/Letter_frequency) have 2 or 3 symbols to fool statistics. for the less frequent letters i get at least some correlation…should do the math when less tired …
——————–
another thing:
if you check for quotations along the older comics you find the following ones (each related to some kind of crypto thing)
“Ssssao fem Iha wl aiv bl olwv rised.” – Antonio Averlino [1465]
“Psidms sid Vjj cp ktmr ojgd.” – Thomas Jefferson Beale [1820]
“Andrsn nds Chh fd smosud wsntdkdfah.” – Sámuel Literáti Nemes [1833]
“Kaagia aag Scc as gr uie srn.” – H. C. Reynolds [1948]
the order seems to be random…but perhaps if you check the order of all 500 it may be some kind of base-4 code
…oh i just read the hints…..maybe last thing has nothing in it
@ steffen:
blockquote was not good, so here again the modified transcript for analysis: (based on the version Ryan kindly provided)
lehvrnsvrnzrnmnsnvzmniuoypj5ypjwpjge1n2ufdwfemivhnh3mzr3svrnibrnzr3uadwpjwkev42nsvt3cvm2uadaxdodpqljfyxjajwpdem3zrnuxyawpemzrn1ualyojwpjfqgyywfemivhn1ualyojwpjge1vvzmnuwdpdldyw5d5d5dwljaem3svrnzrn23m3vzmnsnsnc3h3c3h3zt3mi3s33rvi3s3zrnc3h3zc3mzs32nibmuwxjxjajdpdgyaxec3h3cvm2ugdodw5dpygywpdem3svrnzrnrb1nmvz1vmz2nsupjwyaxdfdpdgdodxem3zvmiuadxem3zr3iuadem3zr3z1ualyojjfywpj5ern3r3vzuad5ypjgyaxeibmuwpdem3r32vcvm2uxj5ykdwpdae13ibmuwpjxdadywadljagdodem3zruagdaw5dxj5ykdwpemc3h3zs3ibmuxjajwxjgdodlqpjwpdem3zrn13s3c3h3zs32nmns32nmnzrn13inibrnruaw5dwxjfdljlqpjwxjfqpypdlez2ncvm2udpylehvrnsvrnzrncufjxe13z1ualyojwpj5d5d5ypjwxjlehvrn3rvz2ninm2nmz2n42nsvt3sniuoypjwpjld5ddpdywlehvrn3mcvm3inmzmnzr3sniuoypjwpjld5d5ypjlqpjwpjxjpjapqfdywlehvrn3rv1v2v1vmrb13ivhnz2nuadwxj5ypjlqpjgdawajwpdododlyojwpjgdodwpemrb13z2nzsnsbcvzrnincvrvmvzm3inmz2ns3z2n13inibrnm3inmivhnz2nsvrnibrnzrnzruapdodw5j5qgylehvrn3rnz2n42nsvt3zrnhnrvc3h3i3s3svrnibrniuoypjwpjfqgyywaj5jwxjem3ivhnzr3iuoypjwxjljaxjadpywxj4xj5ykdwpjdpdldyw5dwljagdodxem3zvmiuadxem3zr3iuoypjwxjljaxjawxj4xj5ykdwpjdaxdywxdfdpdlehvrnz2n42nsvt3zrn1v2v1vmsbcvvzrb13zrn1v2v1vmrnrnvzh3h33mcvcvm2uljfywxjadaxdodpqwpdlehvrnzmnzrnsbinzruaw5dxjfdljlqpjwxjpdlew2ncvm2udpydpdwpjdagygdlecvm2uljfylqaer3zt3miuadwodawpjfd5dgdodljaxjawyaljaxjawpemzs32nzs3mvm3z2nnmi3ibrnt3mzrn3m23vzh3mvz3rv3rvzr3zrnruaxe133mcvt3miuadwpjgdodw5dw5j5jwpjgdodw5dwxjkdalem3t3msbcvzrn2nrnmn1vc3h3zsnt3msbcvzrn2nrnmupjagdodw5jkdapjpjwpjxjpjaywlem3zruaxjw5jlehvrnsniuopj5qgywpjfd5dwkevz2nrnm2313r3vzsupjwpemcu5eibrnsualqgdlehvrnsbcv3mcvzrn13s3ztuywxem3z1nmruaw5dwxjfdljlqpjwgdafjoyawyafjoyawxjfqpydpdwxjwkdalem3zmnt3miuad5qgywpjwpems33mcvvziuoypjwxj4xj5ykdgdlecvm2ulypjwxjw5ddpywpdodler3h3n1vzvmh3iu5ernn1vzrnuxyaywgesuqgdgdodxdpjdpdgdodlehvrn23rnz2nm3m23rnrniuoypjwpjdaxdyw5dwlja5jlehvrnzrnc3h3nmi3zs3z2n13inibrnc3h3nmi3zs3z2n1nhvmzvm1nhvmz2n1brv3r3sniuoypjwpjfd5dlem3sbcvs3s3sntvzvms3s3svrnzrn3m23c3iugyaxe13s3zr3iuoypjpjpjwpjlyojgyaxemnz1nmzr3m3m23z2niuoypjw5jwpjlqpjgyaxe13s3zr3m3m23zrn13s33r3rnrnrb13h3
mzrnzruaqgdfjoyawxjjpesupjpqfdwgqwodapjwpemzs32nsbinzvmsupjdpdgyaxez2niuoypjwxj4xj5ykdwpjwlja
This sure would take a lot of time to decode! XD But I think if each symbol directly translates to a letter in the alphabet, You might use methods similar to the one used in The Gold Bug by Edgar Allan Poe. Here it is: http://xroads.virginia.edu/~hyper/poe/gold_bug.html You’ll have to scroll down a bit to find the decoding part, but you should read the story too, it’s fascinating.
Took me 5 seconds to figure out
“once upon a time there were raccoons. They did stuff. The end”
Now where’s my $500
I think the symbol that looks like a large loop with two circles at the base, always standing alone, is a word by itself. Maybe it stands for Seeomatlakashkay? Because a word that long would be a pretty strong clue.
Also, I have noticed that the symbol like an H with loops on top often precedes the two-topped T. But since I don’t know German, I can’t actually do anything with that.
Also, remember some symbols may be nulls, that have no meaning at all.
Oh, all right, I’ll contribute.
Look at how similar it is to a substitution cipher, statistically! Kind of amazing how there appear to be repeated sequences in many places, despite the fact that it is presumably a sophisticated cipher. But the repeated sequences seem somewhat local. Maybe there’s a slowly time-varying component to the encoding? Or maybe the reason for the repeating sequences in the ciphertext has nothing to do with the idea that there are repeated sequences in the plaintext? Maybe it’s an artifact of the encoding instead? Some more statistics should settle this. I’m going to fire up Octave and crunch through it all a bunch of different ways.
i unfortunately do not have the time to run these numbers myself, but some interesting things i have not seen mentioned in the comments yet:
– punctuation as plain text: someone mentioned it earlier but i think it is important to note that we are lucky that the punctuation is in plain text. this is almost a certainty and gives us several important hints:
— near character to character conversion (analysis of sentence length for this manuscript will probably prove that they are within normal range for english/german words and sentences)
— space is probably counted as punctuation (plain text/not encoded). there is no instance of a space interacting oddly with punctuation.
– alphabet system fits in 31 characters … near base 5 … the 3 trailing characters in the distribution (t k &) look VERY suspicious as “leftovers” when doing base-system changes (like the = or / in most base64 encodes of binary data). i think you might find some very interesting results if you convert all the data to base 5, then decode in base 4 or base 6 (or 7,8,9,etc) (in this case, you should encode the spaces as “00000” since they would still appear as spaces in the new encoding if you translated “000000” or “0000” to the space in the new system). the reason i think base 4 is possible is that by using a mixture of german and english it is possible to use nouns/verbs in such a way that you can write a document that long with only 16 latin characters.
– frequency of character use in the location of each “word”. another thing supporting the base conversion theory is the fact that there are a lot of characters that appear in the same location in each word… why is the “n” character so frequent just before a space?
– “No “classical” steganographic method was used…” … does not mean “no steganographic method was used”. everything appears to be handwritten. main thing there: “appears”. i wasn’t able to find any easily traceable differentiators between the letters (jpeg artifacts and anti-aliasing make me want to believe Novil wasn’t evil and encoded based on color counts or character size) but they ARE different. if nothing else, the letters move up and down slowly against a fixed horizontal rule. these could easily be used to add +1 or -1 encodings to the character encodings making per-character frequency analysis useless (as Novil mentioned it was). a frequency analysis of each character (character height/width?) against a horizontal rule may show some interesting results. at the very least, examining the text for some sort of artifact that would prove some “work of art” is taking place on the letters is worth it. (http://www.robinsloan.com/penumbra/ is an interesting book if you haven’t read it or need a nightlight)
@ nebosuke:
err when i said “base 5” i mean base 2 with 5 bit characters… and when i said “base 4” i meant base 2 with 4 bit characters. sorry.
@ mfb:
i dont speak/write german so i have no idea how common umlauts are… but wikipedia “german alphabet” gives another interesting take on what those two letter combos might be: “Ä, Ö, Ü, ä, ö, ü should be transcribed as Ae, Oe, Ue, ae, oe, and ue”
I don’t have time for this and too lazy anyway…
I happened to notice that in the ‘articles’ section of this site you posted a diagram which showed the number of daily readers who visit the site, and I believe the statistic for this was around 17,000 people in 2012. I am curious to know the number of daily views now, just to gain a rough idea of how many potential code-breakers we may have and also to find out if you reached your target of 25,000 people per day.
Well, I have very limited understanding of cracking codes. I know how some of the most famous codes in history worked, and understand how they were cracked, such as the Germans’ Enigma code, but I don’t think that will be much help here.
Word length frequency seems off to me too. Someone should analyze that. Also… it appears the characters share some properties with each other (horizontal and vertical lines, ~ looking things, / and \ looking things, etc.). Given that character replacement won’t work, and a straight filter wouldn’t work either (for instance, making every _ a dash and every | a dot and then using Morse code) because the word/sentence lengths would be wrong, I propose there may be something more insidious: using the word length to specify the number of characters, then using the number of ~, /, \, etc. in the word to specify which character is which. That way each word could be completely scrambled and the same word would not need to be encoded the same way twice (even though we see that in a few cases on page 3).
The fact that there are not more 2- and 3-letter words (articles in English) makes me suspicious of this (but again, they could have been removed, or the text could just be in German…).
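For anyone who wants to run the word-length analysis suggested above, a rough Python sketch follows. It assumes you already have a transcription of the pages as plain text with spaces preserved; the function name `word_length_histogram` and the sample string (tokens borrowed from the word list posted later in this thread, in arbitrary order) are only for illustration.

```python
from collections import Counter

def word_length_histogram(text):
    """Count how many words of each length occur in a transcription."""
    # Strip sentence punctuation so that "svrn." and "svrn" count as the same length.
    words = [w.strip('.,!?"') for w in text.split()]
    return Counter(len(w) for w in words if w)

# Illustrative sample only, not an actual line from The Book of Woo.
sample = "cvm>u gdod lehvrn sn iuad & svrn."
hist = word_length_histogram(sample)
for length in sorted(hist):
    print(length, hist[length])
```

Comparing the resulting histogram with one computed from ordinary English or German text would show whether short words really are under-represented.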
I seem to be taking a different approach than you guys, so I thought I’d share.
Because of the plaintext punctuation, repetition of “words”, and preservation of spaces, grammatical analysis is possible. Therefore, several things become apparent: Every sentence makes grammatical sense. There are no commas, so it is a simple grammar. Therefore, most sentences will have a single verb, and every sentence has at least one. Most words will be nouns. Most sentences will be declarative. Short words are more likely to relate to the grammar, while long words are more likely to be nouns. Words naturally have endings, which change based on usage. There are several short sentences which are perfect for guessing parts of speech.
Next, I’ve noticed that in this entire text, there is not a SINGLE doubled letter (the same character twice in a row). This is statistically highly unlikely. This means one of two things is happening: One, there is some layer of encipherment specifically to remove them; or Two, great pain and care was taken to avoid them. It is much more likely that the double characters were removed; therefore, we should be looking for things like “double the previous character”, strategically placed null character(s), “ignore the next character”, and choose-by-hand poly-alphabetic ciphers. If there are strategically placed null characters, exploiting this pattern should facilitate their discovery.
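The doubled-letter observation is easy to check mechanically on a transcription. The sketch below is one possible way to do it in Python; `doubled_letters` is a made-up helper name, and the sample strings are illustrative rather than taken from the pages.

```python
def doubled_letters(text):
    """Return (word, pair) for every spot where a character repeats back to back."""
    hits = []
    for word in text.split():
        # Compare each character with its right-hand neighbour.
        for a, b in zip(word, word[1:]):
            if a == b:
                hits.append((word, a + b))
    return hits

# Ordinary English produces hits quickly...
print(doubled_letters("raccoon goddess"))
# ...while these tokens from the word list in this thread produce none.
print(doubled_letters("cvm>u gdod lehvrn"))
```

Running the same function over a page of normal English or German text gives a baseline for how many doubled letters one would expect in a text of this length.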
As for how I’m trying to solve it, I’m putting together a wordmap. I’ve more or less given up on figuring out the exact alphabet; instead, I’m trying to decipher words or phrases and backtrack from there.
The more patterns that can be exploited, the better.
Also, a small curiosity: The words in quotes on the first page seem important, but I don’t think they’re ever repeated again. :/
Also, I strongly suspect the plaintext is in English, for two reasons.
1) There appears to be a one-letter word in line 2 of the first page; it is a letter that is repeated several times inside other words, too. It could be a null character, though.
2) There seem to be more English readers of this comic than German ones. Just by counting the number of comments on the English and German versions of this site for the last dozen comics, I’d say there are at least twice as many English readers as German ones. Being keenly aware of the comment sections in both languages, Novil knew this, and probably geared the interactive challenge toward the largest portion of the fanbase.
Is it also a 26-letter alphabet, like A=_ and B=_, or is it an entirely new structure?
words #>2:
(‘&’, 8)
(‘cvm>u’, 8)
(‘gdod’, 7)
(‘lehvrn’, 6)
(‘iuoypjwpj’, 5)
(‘svrnzrn’, 5)
(‘gyaxe’, 5)
(‘sn’, 4)
(‘iuad’, 4)
(‘t=m’, 4)
(‘lyojwpj’, 4)
(‘c=h=’, 4)
(‘em=’, 4)
(‘=mcv.’, 3)
(‘ibrn.’, 3)
(‘svrn’, 3)
(‘#=in’, 3)
(‘fdlj’, 3)
(‘sbcvzrn’, 3)
(‘iuoypjwxj’, 3)
(‘ibrn’, 3)
(‘#v>v#vm’, 3)
digraphs # > 5:
(‘n ‘, 73)
(‘rn’, 66)
(‘pj’, 65)
(‘j ‘, 59)
(‘= ‘, 45)
(‘zr’, 42)
(‘d ‘, 41)
(‘wp’, 40)
(‘ i’, 40)
(‘jw’, 39)
(‘xj’, 38)
(‘ l’, 37)
(‘=z’, 36)
(‘. ‘, 35)
(‘>n’, 32)
(‘ s’, 30)
(‘m ‘, 29)
(‘dw’, 27)
(‘em’, 26)
(‘nz’, 25)
(‘m=’, 25)
(‘ g’, 24)
(‘a ‘, 23)
(‘s=’, 23)
(‘wx’, 23)
(‘vz’, 22)
(‘vm’, 22)
(‘ua’, 22)
(‘ ‘, 22)
(‘ x’, 22)
(‘=m’, 22)
(‘vr’, 21)
(‘iu’, 21)
(‘pd’, 21)
(‘le’, 21)
(‘ c’, 21)
(‘cv’, 20)
(‘$d’, 20)
(‘od’, 20)
(‘gd’, 20)
(‘ja’, 19)
(‘z>’, 19)
(‘yw’, 19)
(‘r=’, 18)
(‘yp’, 18)
(‘h=’, 17)
(‘dp’, 17)
(‘nm’, 16)
(‘#=’, 16)
(‘w$’, 16)
(‘ #’, 16)
(‘ >’, 15)
(‘ =’, 15)
(‘lj’, 15)
(‘ax’, 14)
(‘do’, 14)
(‘mz’, 14)
(‘hv’, 14)
(‘m>’, 14)
(‘da’, 14)
(‘ya’, 14)
(‘ $’, 14)
(‘ d’, 14)
(‘ y’, 14)
(‘ v’, 14)
(‘aw’, 13)
(‘ad’, 13)
(‘uo’, 13)
(‘gy’, 13)
(‘xe’, 13)
(‘oy’, 13)
(‘e ‘, 13)
(‘ r’, 13)
(‘ f’, 13)
(‘sv’, 12)
(‘ib’, 12)
(‘u ‘, 12)
(‘c=’, 12)
(‘eh’, 12)
(‘=h’, 12)
(‘fd’, 11)
(‘n.’, 11)
(‘#v’, 11)
(‘$y’, 11)
(‘dl’, 11)
(‘zs’, 11)
(‘t=’, 11)
(‘=r’, 11)
(‘sn’, 10)
(‘v ‘, 10)
(‘in’, 10)
(‘mn’, 10)
(‘ p’, 10)
(‘=s’, 10)
(‘kd’, 9)
(‘>u’, 9)
(‘br’, 9)
(‘qp’, 9)
(‘lq’, 9)
(‘sb’, 8)
(‘pe’, 8)
(‘xd’, 8)
(‘>=’, 8)
(‘& ‘, 8)
(‘rv’, 8)
(‘ru’, 8)
(‘qg’, 8)
(‘ &’, 8)
(‘=i’, 8)
(‘$j’, 8)
(‘ m’, 8)
(‘ e’, 8)
(‘wl’, 7)
(‘nr’, 7)
(‘j$’, 7)
(‘py’, 7)
(‘zm’, 7)
(‘ly’, 7)
(‘d$’, 7)
(‘su’, 6)
(‘v.’, 6)
(‘i=’, 6)
(‘\n ‘, 6)
(‘ns’, 6)
(‘\n\n’, 6)
(‘yo’, 6)
(‘hn’, 6)
(‘.\n’, 6)
(‘yk’, 6)
(‘oj’, 6)
(‘bc’, 6)
(‘#n’, 6)
(‘zv’, 6)
(‘ u’, 6)
(‘ n’, 6)
(‘y ‘, 6)
(‘d.’, 6)
see where you can get with this 🙂
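Counts like the ones above can be reproduced with a few lines of Python. This is only a guess at how such a listing might be generated, not necessarily what the poster used: words are split on whitespace (so ‘ibrn.’ and ‘ibrn’ count separately, as in the list), and digraphs are taken over the raw text including spaces and line breaks. The function names and the sample string are assumptions for illustration.

```python
from collections import Counter

def word_counts(text):
    """Count whitespace-separated tokens; punctuation stays attached."""
    return Counter(text.split())

def digraph_counts(text):
    """Count every pair of adjacent characters, including spaces and newlines."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

# Illustrative sample built from tokens in the list above.
sample = "ibrn svrn ibrn."
print(word_counts(sample).most_common())
print(digraph_counts(sample).most_common(3))
```

With a full transcription in `sample`, filtering the results with `count > 2` for words and `count > 5` for digraphs would give lists in the same shape as the ones posted here.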
…Is it sad that I just realized that the Raccoon Goddess looks like one of the brush gods from Okami? I think it just took the color to see it. And if it wasn’t supposed to look like something from Okami then I might just be seeing things, because I absolutely love that game.
all I see are animals in the text o.o
For those that like puzzles, ciphers and the mystery surrounding the Voynich Manuscript, the upcoming game Mark Lane’s Logs: Project H.U.M.A.N. (dev blog at http://www.indiedb.com/games/mark-lanes-logs-project-human) will feature all these ingredients including the Voynich Manuscript, shown under a different light and story.
I bet at least one of the words is Seeoahtlahmakaskay and at least one is Raccoon.
Hi there, I was bored so I wrote a poem. *likes poems*
If there is a mistake please forgive me,
I’m German.
Also I know it’s not the best one, because of the language.
Long time ago it was,
when demons attacked Seeoahtlahmakaskay,
sorrow and darkness was the cause,
a bitter painful day.
Raccoons – she gave ability,
so they found a way to speak,
to help her through (gauntly) infinity,
the enemy, was now the weak.
Forwards to the underworld,
they were steady going,
where the cruel was unfurled
and the lava was flowing.
The keeper pointed them the path,
to defeat the foes,
to allay the demons wrath
and wherever the journey goes.
I don’t know anything about the German alphabet, but I think it should be taken into consideration that the writer of the strip speaks German. Especially when you consider that there are more characters than are in the English alphabet.
@ Dylan:
Most important additions are a letter which looks like a capital B, but the bottom line is not connected, and the vertical line goes beyond the bottom horizontal line. This letter is basically pronounced as a sharp S I believe. Furthermore, some vowels can have double dots on them to change their sound.
Any German people or linguists who know better, please do say so, my German is very limited.
You’re right. The German alphabet has the additional “Umlaute” Ä/ä (ae), Ö/ö (oe) and Ü/ü (ue) and the “scharfes S” ß (also written as “ss”). I think the text contains all four of these letters…
But I wonder what the last letter is…
(Sorry if there are mistakes, I’m German and my English is not good.)
@ Phlosioneer:
Given that Novil flat out said there were more levels of encryption than just mono-alphabetic substitution, I think your idea to try grammatical analysis is the most likely to succeed. It also fascinates me that a comment after yours mentions that the word “raccoon” (or even just “coon”) is likely to appear, because given the subject matter it would be almost impossible to write the text without using it. This means that if what you say is true, and double letters don’t appear in the text, there is a high probability that finding out why, and in the process finding out which word means “raccoon”, could be the most important step in solving this.