Have you ever heard of the Voynich manuscript? It’s a mysterious book, possibly created in the early 15th century, that contains weird illustrations and text written in an unknown script and language. Since its rediscovery in 1912 by Wilfrid Voynich, it has eluded the decipherment attempts of generations of cryptographers. The Voynich manuscript is a fascinating piece of history that has inspired many novels, games and films. Amateur cryptographers can find the latest news and research on the Voynich manuscript and other uncracked ciphers on Nick Pelling’s blog Cipher Mysteries. He’s also the author of the readable non-fiction book The Curse of the Voynich.
To celebrate the publication of the 500th Sandra and Woo strip, I have decided to publish “my own Voynich manuscript”. So here it is, The Book of Woo! As you can see, it resembles the Voynich manuscript in several ways. But of course we couldn’t create 240 pages, 4 had to be enough. Unlike the Voynich manuscript, The Book of Woo definitely contains sensible information that can be deciphered. I guarantee it ;-). And I will pay the person who is able to provide a decipherment that’s sufficiently close to the plain text a reward of $500. Send your decipherment attempt(s) to novil@gmx.de. I would also love to hear about your general ideas or statistical analyses that you carried out. There is no deadline. I will not publish the solution until at least strip #1000.
But be warned: It’s a huge challenge and I don’t expect to receive a valid decipherment at all. It’s primarily a work of art, not a puzzle for the general public. I believe that only experienced and dedicated code breakers have the chance to succeed. A lot of time was spent on the encryption. If you think you can simply carry out a frequency analysis on the letters and be able to reconstruct the English or German plain text this way, well, that’s just a waste of time. However, to make things a little easier, I want to give you the following hints:
- The encryption isn’t based on an algorithm only suitable for computers which executes a loop 100 times or something like that.
- The encryption isn’t based on some sort of device or mechanism that is hard to get.
- No “classical” steganographic method was used since that would just be impossibly hard to crack.
- The plain text is some sort of literature, as one can guess from Woo’s comment and the illustrations. A lot of time went into the plain text as well, it’s not just a copy of the first page of Rascal or something like that.
You can download larger versions of the four pages of the Book of Woo here:
[Update: 10 August 2013] Everybody who is seriously interested in deciphering The Book of Woo should read the comment section. There is a lot of interesting information in it.
[Update: 31 March 2015] The Book of Woo Wiki, now maintained by our reader Chris, also contains valuable information for anyone who’s trying to break the code. In case the wiki should go offline sometime in the future, I created a complete backup of the wiki’s content on 31 March 2015.
In other news, the winners of the Sandra and Woo and Gaia fanart contest 2013 have been posted.
Thanks to everyone who participated!
- Sandra: Hey, Woo, what are you writing?
- Woo: Oh, just a little story.
- Sandra: Really? Can I have a look?
- Woo: Sure, I’ve just finished it.
- Sandra: What in Voynich’s name…?!
I have never decoded anything in my life, BUT I DON’T CARE! THE GAME IS ON! I must unfold the secrets of the mighty See-Kay!
I’ll let you in on a little secret: the Voynich manuscript was most likely faked. Also, I hope someone breaks the code.
Well, at least we are assured that it is a cypher, rather than a novel invented language like D’ni. I remember seeing all the strange writings in Myst II: Riven and wondering whether it was possible or necessary to decipher it – the answers being “no” and “just the numbers”. The person who designed the language had a lot of fun with it, apparently.
Still, at least Woo is going to explain all to Sandra, in the interval before the next strip, where we won’t be able to see… isn’t he?
Woo: I find food and copious petting make for the best storytelling! }8●)
Given the size of the devoted fanbase and the likely amount of people with lots of time on their hands or paws(might take a raccoon to crack the code, and raccoons do like to crack puzzles), I expect the story to be figured out within a week.
cracking this would take a solid 50 hours of work. i’d rather just, like, go to work.
OrphanDidgeridoo wrote:
Optimist. 😀
My guess is, if Novil is so sure that the code would not be broken, then it’s not a code at all!
It’s just gibberish. So most likely, fans would be simply wasting their time.
I can’t help but think about Okami whenever I see Seeoahtlahmakaskay, especially drawn the way she is in the book of Woo.
And, to clarify, “the English or German plain text” means that the original text may be in English or in German, but you won’t tell us which?
Tsunami wrote:
It’s not gibberish, it’s a nice text!
Przemko wrote:
I wrote the story for “The Book of Woo” in English or German or both and an arbitrary number of encipherment steps (in a broad definition) have then been carried out that finally resulted in the text you can see above.
I can definitely now understand why you mentioned previously that this comic was the most time consuming… You didn’t make a puzzle like in ‘a challenging comic’, you created a whole other freaking language! How much time did this take?!
Two things i currently wonder about:
-Why is there the number 10000 on the top of the third page? Is it a clue?
Maybe it is binary, and if so, it’s 16 in decimal. That COULD mean that the code is based on 16 characters, not unlike the Younger Futhark, which fits because the text is much like runes. It’s as good starting point as any I suppose…
-There is one character in particular that caught my interest: .P.
As far as I can see, it only appears in the text on its own, so it might not be a character, but a word, or a name?
I really am an amateur at this, but I’ll give it a try…
Also, I believe I have it already:
There was once this raccoon that looked like she was straight from okami, and she set fire to trees by using the sun. However, the hawk and the wolf that lived in the tree were pretty p*ssed at this.
Once she had ran away from them, she started setting fire to other raccoon’s hands using this glowy ball thingy because she was a pyromaniac.
She burned the half the skin off of this one raccoon, took a picture and put it on her wall.
She then turned into a dragon and started setting fire to an entire forest. And there is a bird guy who likes to point at things.
I may be one or two word out, but this has to be pretty close.
Can’t wait for someone to solve this.
Shoot. I got really excited to work on this, but I don’t speak German. 🙁 Looks fun anyway. Good luck everyone!
Holy shit, Woo is a prophet!
I think that this will go a lot faster with all of our efforts combined. Towards that goal, I propose a standardized transcription alphabet, so we can discuss the cipher in comments, emails, etc. I have linked to my proposed alphabet – based on basic similarities in shape to English letters. I have found 31 characters (just on the first page), so I’ve added 5 extra characters, outside of the English alphabet: =$#>&. The text has standard punctuation. “&” appears to be a special character of some sort, as it is never contained in a longer word and is made of three parts.
http://www.ballwithhat.com/wp-content/uploads/2013/07/abc.jpg
(Sorry – old webcomic domain is the only one I have!)
So the first three lines would be
lehvrn svrnzrn mnsn vzmn: iuoypj $ypjw pj ge#n >ufdwfem ivhn. h=mzr=/
svrn ibrnzr= uadwpjwku v & >nsvt=. cvm>u adaxd odpq ljfy xjajwpd/
em=zrn uxyawpemzrn #ua lyojwpj fqgy ywfem ivhn. #ua lyojwpj ge#v/
(the / at the end of each line simply marks the end of a line – // would indicate a section break)
I hope to have a full transcription completed in the next couple of days, but would appreciate it if someone else made a parallel effort, so we could compare and find mistakes.
Please comment with thoughts, problems, etc. so that we can fix and improve before the whole transcription.
WOO!!!! BEST COMIC #500 EVER!!!
well…e is the most common letter used so maybe that’s a start?
Is that Sly’s crook on page 2? The whole thing looks amazing.
So I got bored, and finished a transcription. I don’t know how accurate it is, but here it is:
lehvrn svrnzrn mnsn vzmn: iuoypj $ypjw pj ge#n >ufdwfem ivhn. h=mzr= svrn ibrnzr= uadwpjwku v & >nsvt=. cvm>u adaxd odpq ljfy xjajwpd em=zrn uxyawpemzrn #ua lyojwpj fqgy ywfem ivhn. #ua lyojwpj ge#v vzmn: uwdpdld yw$d $d$dwlja. em= svrnzrn >=m= vsmnsn “sn c=h=” c=h=zt=m i=s= =rv i=s=zrn c=h=zc=mzs= >n ibmuwxj xjaj dpd. gyaxe c=h= cvm>u gdodw$d pygywpd em= svrnzrn rb#nm vz#vmz>n supjwya xdfdpd. gdod xem=zvm iuad xem=zr= iuad em=zr=z#ua lyoj jfywpj $ern =r= vzuad $ypj. gyaxe ibmuwpd em= r=>v. cvm>u xj$ykdwpd ae#= ibmuwpj xdad ywadlja gdod em=zrua gdaw$d xj$ykdwpem c=h=zs= ibmu.
xjajwxj gdod lqpjwpd em=zrn #=s= c=h=zs= >nmn. ibrn ruaw$dwxj fdlj lqpjwxj fqpy pdlez>n cvm>u dpy.
lehvrn svrnzrn cufj xe#=z#ua ly0jwpj $d$d $ypjwxj lehvrn =rvz>n inm>nmz>n & >nsvt=. sn iuoypjwpj ld$d dpd ywlehvrn =mcv. m=inmzmnzr= sn iuoypjwpj ld$d: $ypj lqpjwpj xjpja pqfd uwlehvrn =rv #v>v#vm rb#= ivhnz>n uadwxj $ypj lqpj. gdawajwpd odod lyojwpj gdodwpem rb#=z>nzsn sbcvzrn incvrvm vzm=inmz>n s=z>n #=in ibrn. m=inm ivhnz>n svrn ibrnzrnzrua pdodw$j $qgy. lehvrn =rnz>n & >nsvt=zrn hnrv c=h= i=s= svrn ibrn. iuoypjwpj fqgy ywaj$jwxj em= ivhnzr= iuoypjwxj ljaxja dpywxj & xj$ykdwpj dpdld yw$dwlja.
gdod xem=zvm iuad xem=zr= iuoypjwxj ljaxjawxj & xj$ykdwpj daxd ywxdfdpd. lehvrnz>n & >nsvt=zrn #v>v#vm sbcv vzrb#=zrn #v>v#vm rnrn vzh=h= =mcv.
cvm>u ljfywxj adaxd odpqwpd lehvrnzmnzrn sbinzruawd xj fdlj lqpjwxj pdlew>n cvm>u dpy. dpdwpj dagy. gdle cvm>u ljfy lqae r=zt=m iuadwodawpj fd$d gdod ljaxjawya ljaxjawpemzs= >nzs= mvm=z>n nmi= ibrn.
t=mzrn =m>= vzh=m vz=rv. =rvzr=zrn rua xe#= =mcv. t=m iuadwpj gdodw$dw$j. $jwpj gdodw$dwxj kda lem=. t=m sbcvzrn >nrnm n#v c=h=zsn. t=m sbcvzrn >nrnm upja gdodw$j. kda pjpjwpj xjpja ywlem=zrua xjw$j lehvrn. sn iuopm $qfywpj fd$dwke vz>nrnm >=#=r= vzsupjwpem cu#e ibrn sualqgd. lehvrn sbcv =mcvzrn #=s=ztu ywxem=z#nm ruaw#dwxj fdlj lqpjwgda =joyawya fjo yawxj fqpy dpdwxjwkda lem=zmn.
t=m iuad $qgywpjwpem s= =mcv vziuo ypjwxj & xj$ykd. gdle cvm>u lypjwxjw$d dpywpd odle r=h= n#vzvm h=iu $ern n#vzrn uxya ywgesu qgd gdod xdpj dpd gdod lehvrn >=rnz>n m=m>= rnrn. iuoypjwpj daxd yw$dwlja. $j lehvrnzrn c=h= nmi=zs=z>n #=in ibrn c=h= nmi=zs=z>n #nhvmzvm #nhvmz>n #brv =r=. sn iuoypjwpj fd$d. lem= sbcv. s=s= sntvzvm s=s= svrnzrn =m>=. c=iu gyaxe #=s=zr= iuoypj pjpjwpj lyoj! gyaxe mnz#nmzr= m=m>=z>n iuoypjw$jwpj lqpj! gyaxe y=s=zr= m=m>=zrn #=s= =r= rnrn. rb#= h=mzrnzrua qgd fjoyawxj jpe supj. pqfdwgqsoda pjwpemzs= >n sbinzvm supj dpd! gyaxez>n iuoypmwxj & xj$ykdw pjwlja!
This is without the line breaks – with them is even bigger and uglier than this! I used single line breaks for sections, and double line breaks for pages. For the third page, I assumed that the top two parts go together (first all of the left column, then all of the right) based on the sentence breaks.
In terms of words, it seems as though the spaces have been preserved from the original text (then again, this is just initial speculation). Also, 30 characters, plus a special one (perhaps a name, as Mr_Nabby proposed) – maybe 26 English letters, plus ö,ä,ü, and ß? The most common words are cvm>u, lehvrn, gdod, ibrn, svrnzrn, c=h=, iuoypjwpj, and gyaxe, in that order. No 18 letter words – I had hoped “Seeoahtlahmakaskay” would appear! Most common letters are =, n, d, j, and m, in that order. Then again, it won’t be as simple as a substitution cipher!
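For anyone who wants to repeat these counts, here is a minimal Python sketch. It uses only the first three lines of the transcription posted above as sample data, and it assumes that the ordinary punctuation marks are not cipher symbols; the function names are made up for illustration. Run on the full transcription, it should give rankings comparable to the ones listed above.

```python
from collections import Counter

# First three lines of the transcription posted above ("/" line-end markers removed).
SAMPLE = (
    "lehvrn svrnzrn mnsn vzmn: iuoypj $ypjw pj ge#n >ufdwfem ivhn. h=mzr= "
    "svrn ibrnzr= uadwpjwku v & >nsvt=. cvm>u adaxd odpq ljfy xjajwpd "
    "em=zrn uxyawpemzrn #ua lyojwpj fqgy ywfem ivhn. #ua lyojwpj ge#v"
)

# Assumption: these are plain punctuation, not part of the cipher alphabet.
PUNCTUATION = '.,:;!?"'


def word_counts(text):
    """Count words after stripping punctuation from both ends."""
    words = (w.strip(PUNCTUATION) for w in text.split())
    return Counter(w for w in words if w)


def symbol_counts(text):
    """Count transcription symbols, ignoring whitespace and punctuation."""
    return Counter(c for c in text if not c.isspace() and c not in PUNCTUATION)


words = word_counts(SAMPLE)
symbols = symbol_counts(SAMPLE)
print(words.most_common(5))    # most frequent words in the sample
print(symbols.most_common(5))  # most frequent symbols in the sample
```

Replacing SAMPLE with the complete transcription is the only change needed to check the full text.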
I want to say that in no way do I intend to take away from the artistic merit of the original. It is truly a work of art! But in order to do real work on deciphering it, we need a more accessible format.
I usually start a code by looking for I or a; they are the only letters that can double as a word. So if you find a symbol on its own, it is probably one of those two letters.
Then move on to the most common three-letter word, which is usually ‘the’.
To help with translation: the text looks like it is written in some form of runes.
@ Kerilithia:
It’s not a simple substitution cipher. This needs to be addressed using serious methods.
Since I’m pretty sure I will win, I won’t join a group effort unless I get stuck. 😉
I’ve already spotted what I think is one clue nobody has mentioned.
@ Xezlec:
Make that TWO clues nobody has mentioned. I think. (One of them for sure.) Also found a statistical pattern of possible interest… assuming it’s not a red herring.
@ Xezlec:
Psh. Never mind. On second glance, that’s probably just ROT13 and anagrams and most likely has nothing to do with the manuscript. Back down to one clue I guess. OK I’ll shut up and go to bed now.
@ Xezlec:
“Fnaqen naq Jbb by Kernel River Zoo and Paid in Ruin”? Yeah, probably just some fun. I considered the ciphers at the top of the pages, too, but those are just for flavor, I think 🙂 They were around back in September and October last year, too. “In other news, great events cast their shadows (far) ahead. If you know where to look for them.” (from [0411] Piano Lessons)
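The ROT13 guess is easy to check. A minimal sketch using Python's standard codecs module, applied only to the first three words of the quoted phrase (the rest is presumably the anagram part and is left alone here):

```python
import codecs

# The ROT13-looking fragment from the phrase quoted above.
fragment = "Fnaqen naq Jbb"

# "rot_13" is a text-to-text codec in the standard library:
# every letter is shifted 13 places, everything else is left unchanged.
decoded = codecs.decode(fragment, "rot_13")
print(decoded)  # prints "Sandra and Woo"
```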
@ Kerilithia:
Not necessarily, since many languages don’t use the word “the” (like Latin) but instead just the word.
> “Ssssao fem Iha wl aiv bl olwv rised.” – Antonio Averlino [1465]
https://en.wikipedia.org/wiki/Filarete
Also, I think it is my imagination, but the first page has bold glyphs. Check the image comments too, if any were placed in the .jpeg files. I’d dislike to do statistical analysis for this. But Ryan did us a huge favor. I am with OrphanDidgeridoo on this one. (Even though I can pull sed(1) and wc(1) for this, maybe even some mawk). Too tired. Thanks, Novil, for this awesome puzzle.
We might need a pastebin for this. Any idea where we can host the collaborative notes?
I find it amusing that after you explicitly said it’s not a simple monoalphabetic substitution, people are still chasing after the monoalphabetic substitution angle.
At least I have a better chance at this than the remaining parts of Kryptos. I’m pretty sure that one has information somewhere outside text that is probably necessary, from the parts people HAVE read.
Novil wrote:
So it sounds as though Novil put it through one encoding (encipherment) method, piped the result through another, probably piped that result through another and so on – but of course he’s not going to tell us how many different algorithms he used 🙂
However, I assume by the fact it is theoretically decipherable that Novil hasn’t used any one-way encryption algorithms.
Nice work everyone. I hope I can join in the deciphering this manuscript tonight.
However from the pictorial clues this is my rough prediction:
First page:
In a previous S&W comic, Sid mentioned the Yggdrasil tree. It seems like the one on the lower part of the tree is Ratatoskr? Then who’s the other one? It seems like Seeoahtlahmakaskay stole a power or knowledge from the tree or from Ratatoskr himself. The bird is probably the unnamed Eagle that perched on top of the Yggdrasil tree and the wolf is Fenrir (I think this is a reason why in the Woo universe everyone hates wolves. They are the direct descendants of Fenrir).
Page 2
After escaping the wrath of Fenrir and the unnamed Eagle, she went back to earth to share the knowledge and power she stole from the Yggdrasil tree with her subjects on earth.
Page 3
10.000 years later the Raccoon society created their own laws and cultures and start creating their own society and probably cities.
Page 4
Unfortunately, the unnamed Eagle finally found out about the coon society. As punishment for stealing knowledge and powers from the tree, their cities are destroyed by the Eagle and they are sent back to the wild.
I have this feeling that this manuscript will give us hints that Raccoon speaking like human was actually pretty common back then.
If I am going to solve this, then I must ask: is it in english or german? If it’s in english, then I have a shot, but I don’t know any other languages so I’m screwed if it’s meant to be translated into german. 😛 You may answer this in your next post if you like.
I want to point out that it has exactly 31 characters and with the addition of a space this code could have some connection to 32 bit computing. I know she said it is realistically possible to solve this without some use of a computer, but it is possible to do 32 bit math without one
I’m guessing it’s some more of the “lighthearted” wolf hate propaganda designed to desensitize people to the campaigns to hunt and trap them everywhere they live in the US, which the strip authors have repeatedly shown they love so much.
@ AckAckAck:
@ swiftfox:
It’s always fun kicking someone when they’re down and taking the side of the murderous mob.
I did a bit of work and found something interesting.
I wrote a program to determine word-length frequency and ran the text of the Woo manuscript through it. The distribution peaks at four-letter words, trailing off into an exponential curve BUT with a second, lesser peak at 7 letters and a third mini-peak at 9.
To compare with, I ran the program on three other texts: The entire text of Cory Doctorow’s Homeland, the first chapter of GK Chesterton’s The Club of Queer Trades, and the fourth chapter of Kafka’s The Trial (in German). They displayed the following characteristics:
Both English samples were almost exactly alike. The most common word length was three letters long (no surprises there). The curve then settles down into what looks like an exponential decay. Homeland was slightly meatier on 5- and 6-letter words than The Club of Queer Trades, but it doesn’t change the overall shape of the curve.
The German sample had a different curve. First off, German has no one-letter words (at all). It has far fewer two-letter words, but three-letter words are still the most common. The German text had a slight bump at the six-letter mark, as well as a notable bump at the 9-letter mark. The curve otherwise matches a similar decay pattern.
By comparison, the Woo curve has about half as many one-letter words as the English proportion; far fewer two-letter words than either English or German; WAY fewer three-letter words, with a peak at four-letter words instead; an exponential curve interrupted by two extra peaks. It looks kind of like the German curve, if it were shifted a bit to the right, but not entirely.
As such, I have the following conclusions:
1. The text is likely partially in both English and German.
2. The author has almost DEFINITELY messed around with word spacing or something that would create a similar effect. I suspect short words have been stuck to longer words or something similar.
A graph is here:
http://imgur.com/4N8KIS9
Homeland is in pencil. The Club of Queer Trades is in turquoise. Der Process (The Trial) is in blue. The Woo Manuscript is in black. I kind of ran out of pen colors at this point.
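The program I used is in Python, and is as follows:
# Count how many words of each length occur in the transcription.
with open("voynich.txt") as f:
    text = f.read()

# split() with no argument splits on any whitespace (including line breaks)
words = text.split()
lengths = [len(word) for word in words]

# Cap the histogram at 20 letters
maxlen = min(max(lengths), 20)

# wordfreqs[x] is the number of words that are x letters long
wordfreqs = [lengths.count(x) for x in range(maxlen + 1)]
print(wordfreqs)
print()
# Scale the counts so that the most common word length is 100
print([x / max(wordfreqs) * 100 for x in wordfreqs])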
There will always be 0s in the first line of the arrays output; this is for 0-letter words. Programming quirk.
The last line outputs the number of words of each length, divided by the number of words of the most prevalent length (so if there are 80 five-letter words and 300 three-letter words, the result will be 80/300 ≈ 27%). This is to help generate a graph like I did, with the top line as 100 and the bottom as 0.
Huh, I’ve never decoded anything in my life, but this looks like fun! And it doesn’t look that hard either. 🙂
The most commonly used letter in the alphabet is e, so I think I’ll start by finding the most common symbol. Then again I’m just assuming from the pattern that each symbol is a letter.
@ Kerilithia:
Huh, in Czech, that would be fun! Letters that can double as words in Czech are a (and), i (and also), o (about), u (at), v (in), z (from)
So… anybody fluent in the raccoon language? XD
I think I got something from the second page: Sly’s cane has gears in the background, representing the instinct of stealth and ability to get into things most animals can’t. The coon with signal waves around it represents its sense of environmental awareness. I have no clue what the other symbol means, but it has to do with their hands and/or claws.
sadly i´m shitty at decoding, though i can invent codes just fine….anyway, i´d say the third illustration is about the raccoon-goddess advice/piece of wisdom ‘stay curious + follow your paws’ since the (part of) raccoon that´s with the cage is dead whereas the part with the woods is still alive. and the part about Yggdrasil is rather obvious…shit, i´m already dying for the translation, and hope we´ll get some background about the secret raccoon alphabet as well!
Now this looks like an interesting way to waste time.
About the only way I can think of to beat frequency analysis is to either use a continually changing cipher, or inventing a new language/grammar structure.
I doubt I’ll crack it, but I’ll give it a shot.
Hello, all!
So I’ve done the transcription over again, and by comparing the two and checking the original where discrepancies occur, I’ve created what I consider to be the “right” version. Note that the first one was actually missing a line, along with other scattered typos (I counted 15). Based on how many mistakes I had made both times, I can say that this one is perfect with 90% accuracy (and fewer than two mistakes with 99.5% accuracy!)
@ Ryan:
Wow, I go to sleep and when I wake up I find this impressive work! Good job.
I noticed that you named the symbol I mentioned earlier “&”. That COULD very much be just what that symbol actually means, couldn’t it?
@ ekimmak:
While it could be a continually changing cipher, that is pretty unlikely based on the minimal work I’ve done. There are many words and, more importantly, chunks of words that repeat. With a changing cipher, that would only occur when the loop in the cipher lines up with a repetition in the text – far too rare for what we’ve seen. My initial thought is that at least one layer of the encipherment is something like the word games “Gibberish” or “Ubbi dubbi” – which would account for the odd commonness of repeated bits in words.
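The "chunks of words that repeat" observation can be put into numbers. Below is a rough Python sketch, not a rigorous test: it counts every run of n consecutive symbols inside words and keeps the ones that occur more than once. The sample is again just the first three transcription lines posted above, the function name is hypothetical, and punctuation is assumed not to be part of the cipher.

```python
from collections import Counter

# First three lines of the transcription posted above ("/" line-end markers removed).
SAMPLE = (
    "lehvrn svrnzrn mnsn vzmn: iuoypj $ypjw pj ge#n >ufdwfem ivhn. h=mzr= "
    "svrn ibrnzr= uadwpjwku v & >nsvt=. cvm>u adaxd odpq ljfy xjajwpd "
    "em=zrn uxyawpemzrn #ua lyojwpj fqgy ywfem ivhn. #ua lyojwpj ge#v"
)


def repeated_chunks(text, n, punctuation='.,:;!?"'):
    """Return a Counter of n-symbol chunks that occur more than once inside words."""
    counts = Counter()
    for raw in text.split():
        word = raw.strip(punctuation)  # assumption: punctuation is not enciphered
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    # Keep only chunks seen at least twice
    return Counter({chunk: c for chunk, c in counts.items() if c > 1})


chunks = repeated_chunks(SAMPLE, 3)
print(chunks.most_common(10))
```

Running the same function on an ordinary English text of the same length gives a baseline for how much repetition is normal, which is what the comparison with a changing cipher comes down to.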
@ Mr_Nabby:
Absolutely! I have no idea! It seems that there is a pattern in the following and preceding words – which we could interpret as binomial pairs (“salt and pepper”), supporting the “and” theory; or we could interpret as a common action for someone to make (“So-and-So brought justice”), supporting the name theory. Or something we haven’t thought of yet.
Very nice! Love the art on this one.
Concerning the text: I wish I had the time to take a look at that. Been a while since I did crypto stuff. Good luck to those who try it!
I’d like to add a couple points of data for our valiant code breakers working on this.
first, I want to point out that at the bottom of page 3 is what looks like either poetic repetition, or some kind of list (possibly of commandments?). you can tell by the repetition of the first word/first couple of words.
second I want to suggest that the symbol that resembles a “T” with a line over it (represented by ryan as “=”) be considered a vowel, because it appears very frequently in many words, including the three letter word at the beginning of the list I mentioned.
thirdly, on the first page there is a two letter word, followed by a four letter word and both words are in quotes. this suggests they are either a line of dialogue (unlikely but possible) or a name.
@ Alfa:
You may find that you have overestimated yourself… Novil worked really hard on this and I can’t see it being ‘easy’ to crack.
I think I’ve got a link as to the first step in the decode…or rather the second. It’s evil. Like really evil. I shouldn’t even recognize this. AUGH! Now I must translate!!! I has no choice!!
I know nothing about decoding things, so I guess I just have to wait for someone else to do it. At least the story is easy enough to understand given the context of the goddess we learned about earlier.
Since so many people share their results, it doesn’t seem fair any more to just give the whole money to the one person who builds upon these results and has the right final idea. I will think about a different division of the prize money, maybe involving charities.
Still a better love story than Twilight
Woo, stupid, you should have kept it secret. That way, you could have attempted to sell it to the empress of Northia for 600 gold ducats.