Have you ever heard of the Voynich manuscript? It’s a mysterious book, possibly created in the early 15th century, that contains weird illustrations and text written in an unknown script and language. Since its rediscovery in 1912 by Wilfrid Voynich, it has eluded the decipherment attempts of generations of cryptographers. The Voynich manuscript is a fascinating piece of history that has inspired many novels, games and films. Amateur cryptographers can find the latest news and research on the Voynich manuscript and other uncracked ciphers on Nick Pelling’s blog Cipher Mysteries. He’s also the author of the readable non-fiction book The Curse of the Voynich.
To celebrate the publication of the 500th Sandra and Woo strip, I have decided to publish “my own Voynich manuscript”. So here it is, The Book of Woo! As you can see, it resembles the Voynich manuscript in several ways. But of course we couldn’t create 240 pages, 4 had to be enough. Unlike the Voynich manuscript, The Book of Woo definitely contains sensible information that can be deciphered. I guarantee it ;-). And I will pay the person who is able to provide a decipherment that’s sufficiently close to the plain text a reward of $500. Send your decipherment attempt(s) to novil@gmx.de. I would also love to hear about your general ideas or statistical analyses that you carried out. There is no deadline. I will not publish the solution until at least strip #1000.
But be warned: It’s a huge challenge and I don’t expect to receive a valid decipherment at all. It’s primarily a work of art, not a puzzle for the general public. I believe that only experienced and dedicated code breakers have the chance to succeed. A lot of time was spent on the encryption. If you think you can simply carry out a frequency analysis on the letters and be able to reconstruct the English or German plain text this way, well, that’s just a waste of time. However, to make things a little easier, I want to give you the following hints:
- The encryption isn’t based on an algorithm only suitable for computers which executes a loop 100 times or something like that.
- The encryption isn’t based on some sort of device or mechanism that is hard to get.
- No “classical” steganographic method was used since that would just be impossibly hard to crack.
- The plain text is some sort of literature, as one can guess from Woo’s comment and the illustrations. A lot of time went into the plain text as well, it’s not just a copy of the first page of Rascal or something like that.
You can download larger versions of the four pages of the Book of Woo here:
[Update: 10 August 2013] Everybody who is seriously interested in deciphering The Book of Woo should read the comment section. There is a lot of interesting information in it.
[Update: 31 March 2015] The Book of Woo Wiki, now maintained by our reader Chris, also contains valuable information for anyone who’s trying to break the code. In case the wiki should go offline sometime in the future, I created a complete backup of the wiki’s content on 31 March 2015.
In other news, the winners of the Sandra and Woo and Gaia fanart contest 2013 have been posted.
Thanks to everyone who participated!
- Sandra: Hey, Woo, what are you writing?
- Woo: Oh, just a little story.
- Sandra: Really? Can I have a look?
- Woo: Sure, I’ve just finished it.
- Sandra: What in Voynich’s name…?!
@ AckAckAck:
Step aside everyone, genius coming through!
What a great idea, Novil! 🙂
If only I had seen this yesterday, now I feel like everybody has a headstart. ^^
Also:
Sandra and Woo at Tanagra.
The beast at Tanagra.
Shaka, when the walls fell.
Sandra and Woo on the ocean.
— Ben
@ Ben:
I see what you did there…
@ Ben:
I’ll catch you up to speed.
Novil hinted, and statistics eventually confirmed, that there is more than just a substitution encryption in play here. there are 32 different symbols in the book of woo, and no two identical symbols appear next to each other. some people think that the second level of encryption is based on dummy letters that don’t mean anything (which would explain the extra letters in the alphabet), put there in order to throw us off. others think that some of the extra letters could represent the extra letters that appear in the german alphabet, and the rest could represent capitalized letters or names.
the main theory right now is that the code is a vigenere cipher, an explanation of which can be found here: http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher .
while this looked promising at first, its initial proposer and one of our best cryptanalysts, phlosioneer, has found several inconsistencies that are unexplained as of yet. also Novil seems to be dropping a lot of hints that we’re getting cold, so I’m beginning to think vigenere cipher isn’t the right way to go.
well, now you know everything I know, including what i’ve been doing for the last 72 hours…
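The two observations in the comment above (32 distinct symbols, no doubled symbols) are easy to re-check on any transcription. A minimal sketch, assuming the transcription is a plain string with one character per symbol; the function name and the sample strings are made up for illustration and are not taken from the Book of Woo:

```python
def symbol_stats(text):
    """Return (number of distinct non-space symbols, list of doubled symbols)."""
    symbols = {ch for ch in text if not ch.isspace()}
    # A "double" is the same symbol twice in a row inside a word.
    doubles = [a + b for a, b in zip(text, text[1:])
               if a == b and not a.isspace()]
    return len(symbols), doubles


if __name__ == "__main__":
    print(symbol_stats("cvm> c=h="))   # made-up sample, not the real text
```

An empty list of doubles on the full transcription would match the "no two identical symbols next to each other" claim.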
By the way, have you ever considered the possibility that someone might just hack into your computer and steal the original manuscript rather than decode this?
To the genius team that creates this wonderful comic: my hat would be off to you if I owned one. ^^ I am not a great cryptographer, and I won’t be making an attempt to decode this, but I admire you for creating such a wonderful challenge.
My only request is that you share the outcome with your adoring public. ^^
Could it be SsssaofemIhawlaivblolwvrisedAntonioAverlino? That has the right letter count.
@ RegentKyre:
Oops, 43 letters.
PsidmssidVjjcpktmrojgdThomasJeffersonBeale has the right letter count though.
Could you make this website friendly to off–the–road mode, thanks in advance a loyal reader and fan.
Finally, some math I can do:
If this is strip 500 and you update twice weekly, assuming there’s no special stuffs, that means we’ll get the solution in approximately 250 weeks, which is 1750 days or nearly 5 (more specifically, about 4.79) years. Yikes.
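The estimate can be worked through step by step. A small sketch, assuming exactly two strips per week with no extra updates and an average year of 365.25 days (both are assumptions, not something Novil stated):

```python
def time_until_solution(strips_remaining=500, strips_per_week=2):
    """Weeks, days and years until strip #1000, counted from strip #500."""
    weeks = strips_remaining / strips_per_week
    days = weeks * 7
    years = days / 365.25          # average year length, an assumption
    return weeks, days, years


if __name__ == "__main__":
    weeks, days, years = time_until_solution()
    print(f"{weeks:.0f} weeks, {days:.0f} days, {years:.2f} years")
```

With these assumptions the result comes out at a little under 4.8 years.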
quite a few comments hidden due to too many dislikes; the number of competitive people is high in this one
Ok, I am stumped, so I’ll post this here: By looking at the distribution of character pairs I have discovered that the alphabet can be cleanly separated into two parts, with “u” and “e” serving as separators (incidentally, they look like on and off switches) between two sets. Sets seem identical in function. For example, “z” in one subset is equivalent to “w” in other. The distribution of separators doesn’t suggest that they are spaces of any kind, so I replaced the equivalent letters, and created a “reduced” manuscript, now with only 15 letters (I ignore “&” for now). It also has a curious separation of pairs (for example, only 6 symbols may end the word). So now I am stuck trying to make at least 26 symbols out of 15 (obviously, at least pairs are involved, or maybe some kind of variable length code – all doable on pen and paper). In the meanwhile, enjoy the connection table (if it comes out right in comments):
(_ = space, i.e. word boundary; rows are the first symbol, columns the symbol that follows)
  _ = v n / b m # r s i h > t c z
_ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
= 1 0 0 0 0 0 1 1 1 1 1 1 1 0 0 1
v 1 0 0 0 0 0 1 1 1 0 0 1 1 1 1 1
n 1 0 0 0 0 0 1 1 1 1 0 1 0 1 1 1
/ 1 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1
b 1 0 0 0 0 0 1 1 1 0 1 0 0 0 1 1
m 1 1 1 1 1 0 0 0 0 0 1 0 1 0 1 1
# 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
r 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
s 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
i 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
h 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
> 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
t 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0
c 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0
z 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1 0
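For anyone who wants to rebuild a table like this from their own transcription, here is a rough sketch. It assumes the text is a plain string with one character per symbol and spaces between words; the function names are made up, and punctuation would have to be stripped beforehand or it is counted as a symbol:

```python
from collections import defaultdict


def connection_table(text):
    """Map each symbol to the set of symbols seen directly after it.

    A space stands for a word boundary, so follows[" "] holds the
    word-initial symbols and " " in follows[x] means x can end a word.
    """
    follows = defaultdict(set)
    padded = " " + " ".join(text.split()) + " "
    for a, b in zip(padded, padded[1:]):
        follows[a].add(b)
    return follows


def format_table(follows, symbols):
    """Render one 0/1 row per symbol, in the given order ("_" = space)."""
    rows = []
    for a in symbols:
        cells = ["1" if b in follows.get(a, ()) else "0" for b in symbols]
        rows.append(" ".join([a.replace(" ", "_")] + cells))
    return rows


if __name__ == "__main__":
    table = connection_table("cvm>/ c=h=zs=")   # tiny sample only
    print("\n".join(format_table(table, " =vn/bm#rsih>tcz")))
```

Run on the whole reduced text, the rows should show the same near-bipartite block pattern as the table above, if the transcription matches.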
@ Satsuoni:
That… is huge. That explains so much!
What if the order of letters in words is reversed (ekil siht)? or if the last letter of every word has been switched with the first letter of the next word?
@ Satsuoni: You must not want the $250 very much to just post this huge discovery here for all to see.
Beregorn wrote:
Then the matrix would be transposed, mainly. As one can see, that here is almost a perfect bipartite graph, with only m and z out of place (the first row/column is space). “z” seems to separate groups off the main body of the word, and right now I think that it may reset the “state” of the decoder. “m” is strange. If one were to assume that the first 6 (including m) symbols are terminals, one would get exactly 39 groups, with single “m” being the majority (z excluded).
@ Novil: I stared at this matrix for 2 days already before posting… I am kind of stuck. Also, I can live without $250, and I want this thing solved. By the way, this is why I asked about a typo on 4th page – there is a single “w” in text that I think should have been “z”, it really stands out. Since it was the single letter out of about 2500 I assumed that it was missing the slash that makes it “z”.
Anyway, if I were to somehow win this, I have no idea how you’d get the money to me 🙂 We’ll see.
At least I think this is real progress and not red herring… I hope.
P.S. Sorry for weird english, lack of sleep is messing with my brains.
@ Tynan:
not all of it is in German. some of it is English.
I just realized… You reached your 500th milestone at about the same time Brawl in the Family did…
Is there a key to find first or no key?
@ Satsuoni:
Very cool work!
Perhaps the u and e serve as markers. This is only a pet theory at the moment but perhaps the alphabet is split at the 15 character mark and then switched to the “higher” alphabet after the presence of one of these switch characters. If thats true one of the characters can’t hold all the combinations.
Another equally compelling version in my mind would be that there are 13 letters and those two oddball ones you have z and m combinations represent the alphabet order from the top to the bottom 13 letters.
food for thought anyways. I would love to see your reduced letter version 🙂
the german alphabet has 30 letters, if you include diacritics. this has 30 letters, if you just say the ampersand (&) is a special letter, just as ryan pointed out.
there is a good chance it is written in german. therefore, we should use the german alphabet instead of the english one for that. i’ve just done a find and replace on ryan’s transcription (IT IS SO HELPFUL, THANK YOU), and put it on my dropbox.
link to german alphabet text: https://dl.dropboxusercontent.com/u/30964123/Documents/The%20Book%20of%20Woo/edited%20CypherText.pages
link to character frequencies list: https://dl.dropboxusercontent.com/u/30964123/Documents/The%20Book%20of%20Woo/char%20freqencies.pages
@ Satsuoni:
Woo! Good job, and thanks for posting! I’ll have to see how this blends with grammatical analysis; perhaps the mysterious ABAB, ABCB, ABCBDBE, and ABCBDEB patterns can be explained. The two-letters-mean-one-letter idea would also explain the abnormally high letter-per-word ratio.
@ Mac Johnson:
Thanks for the compliment, but I’m not convinced I’m one of the best – Jamie, Tahrey, Ryan, and now Satsuoni have come up with the most ideas and have even published progress; I’ve simply advocated one way of tackling the problem and have yet to publish results. Plus, there’s the mystery person who sent Novil something, plus other cryptographers lurking in the shadows, either working privately or watching these comments but not adding to the conversation. (Which is perfectly fine – to each his own.)
Also, I didn’t propose the Vigenere Cipher, that was Jamie. At one point I did mention that I didn’t think letter frequency was even enough for it, but that’s all.
@Novil
I can’t speak for anyone else, but I’m not in it for the money; I’m in it for the practice, the fun, and the story!
i’ve shifted the letters by a number using a program here. just in case a shift was put on top of anything else in there.
link to shifted: https://dl.dropboxusercontent.com/u/30964123/Documents/The%20Book%20of%20Woo/ciphertexts%2C%20shifted.zip
i also did some basic stat analysis on it, spreadsheet here: https://dl.dropboxusercontent.com/u/30964123/Documents/The%20Book%20of%20Woo/character%20frequencies%20using%20German.numbers
the frequency of the letters doesnt get anywhere near german (well, according to the stats on wikipedia :P), so its clear that some work has been done to change the character frequencies. vigenere, likely. i’ve run the alphabet (well, my version of it, removing all punctuation and ampersands, and changing ryan’s =,$,# etc into german letters) through some automated vigenere solvers, but nothing has been found, so im looking for some other shifts.
secondarily, there may be a transposition, but thats just a gut feeling, as i use them a lot for quick things.
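The frequency comparison described above can be reproduced with a few lines. A minimal sketch, assuming a plain-string transcription; the function name and the set of ignored characters are my own choices, and the German reference frequencies are not included here and would have to come from a published table:

```python
from collections import Counter


def relative_frequencies(text, ignore=" \n.,:;!?\"&"):
    """Relative frequency of each symbol, most common first."""
    counts = Counter(ch for ch in text if ch not in ignore)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.most_common()}


if __name__ == "__main__":
    # Made-up sample; feed in the real transcription instead.
    for symbol, share in relative_frequencies("abca b.").items():
        print(symbol, f"{share:.1%}")
```

A flat distribution compared to natural language would be consistent with the suspicion that something beyond simple substitution is going on.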
I merged all double posts so that the comments fit into two pages once again.
@Phlosioneer: That is not a fair comparison. As an example, I read the comic in German, but I post here as it is pointless to split the analysis.
@Satsuoni: Great to see that you studied that. That is much more structure than I expected. That also removes the observed higher entropy per letter.
How did you correlate the two sets (I can see z/w as they have a special function, but what about r, s, i and so on?)?
Well, first I noticed that excluding z,w and switches m and a are the only letters to break bipartition (see above), leaving me with letters divided into 2 subsets by number of connections. Then I noticed that replacing the matched letters in one subset leads to parts of the words equal to the words in another subset (“suffixes” in particular) and went from there. I am pretty sure it was correct, but it won’t hurt to doublecheck I guess… Here is the list:
u : e, a : m, z : w, > : x, n : j, v : y, h : o, # : f, s : $, = : d, c : g, r : p, i : l, t : k, b : q
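To get from a full transcription to the reduced 15-symbol version, the pairs in this list can be applied as a plain character replacement. A sketch under two assumptions that I read out of the thread rather than out of anything Novil confirmed in detail: the symbol kept from each pair is the one that appears in the reduced alphabet of the connection table, and both switches (u and e) are written as "/":

```python
# Each key is replaced by the partner that is kept in the reduced alphabet.
PAIRS = {
    "e": "/", "u": "/",            # both switches become "/"
    "a": "m", "w": "z", "x": ">", "j": "n", "y": "v", "o": "h",
    "f": "#", "$": "s", "d": "=", "g": "c", "p": "r", "l": "i",
    "k": "t", "q": "b",
}
REDUCE = str.maketrans(PAIRS)


def reduce_text(text):
    """Collapse the two equivalent symbol sets into one."""
    return text.translate(REDUCE)


if __name__ == "__main__":
    print(reduce_text("luoyrj"))   # made-up input, not from the transcript
```

Symbols that are already in the reduced set, spaces and punctuation pass through unchanged.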
@ mfb:
Good point. I hadn’t accounted for our bilingual members.
@Novil
NOOOOOO! I had comment #200, too. 🙁
@ Satsuoni:
I can’t say whether your comparison is correct or not since I changed a bunch of the characters to get rid of symbols, but I just used the “X & Y” to get 90% of the matches.
Hi all, nice to meet you. I’ve only been lurking because I’m supposed to be preparing for my qualifying exam, but this puzzle is sucking me in.
@ Satsuoni:
After you mentioned that u and e seem to switch between subsets of the alphabet, I worked out the same set of pairings, so I’m pretty sure that’s correct.
Then I replaced all the letters of one set with the other, and took the z/w as separating “syllables” (I don’t know if that’s what they’re actually doing. It looks interesting, certainly.). But there’s this one construction at P1, S4 that seems strange to me ( | is my separator, where z was):
cvm> c=h= cvm> c=h=|s= …
It doesn’t seem like a common English word pattern to me. Does it look familiar to anyone who knows German? Or maybe Phlosioneer’s grammatical analysis of this sentence would help.
@ biocuriousgeorgie:
Hm, you forgot the transfer symbol. I am pretty sure it doesn’t go away, as in, it means something beside switching alphabets. It is in my matrix above as “/”. The “symmetry” kind of breaks without it. (Not a guarantee that it doesn’t go away, but still)
Here is the whole sentence in my rendition: “cvm>/ c=h= cvm>/ c=h=zs= rvcvzr= /m= svrnzrn rb#nm vz#vmz>n s/rnzvm >=#=r=”
And yes, still no idea how to code those 15-ish (counting service) symbols into 26 or 30 alphabet symbols. I tried to check the breakdown scheme (break each symbol into 1, 2 or 3 symbols and then gather them together), but the words do not break into equal blocks that way (unless they are padded by, say, 0-s), so that was a dead end, seemingly.
So far, most of the text seems to be organized in letter pairs, with the exception of m and z, z always preceding a “code group”, and m being attached wherever, as far as I can tell 🙁 Also, for whatever reason, if “v” starts the word, it is always followed by “z” if there is more than one letter (there is one single letter “v” in text). Again, as far as I can tell…
Also, I have thesis defense next week, so I should really stop, but I can’t 🙁
P.S. I am not posting the whole transcript because I am greedy 🙂 Also, I have given the character map anyway. Those who want it, can simply replace all symbols in the transcript from the first page.
@ biocuriousgeorgie:
Welcome to the comments! Anyway, I haven’t been able to tackle it yet because of work, but I use a cipher site to look up simple word patterns for english. It’s also handy for finding words of different lengths, because you can type in “abc” and get most three-letter words. (link) I haven’t found a german version of the same, though if it is german I won’t be able to crack it anyway, so I haven’t been looking.
@ Satsuoni:
I found two instances near the bottom where v started the “word” but was not followed by the separator.
@ Charles:
If such words exist, they are not in my transcription… or I can’t find them. Could you write them out, please?
@ Satsuoni:
Last line on the third page, 3rd word “va_xn”
First line on the fourth page, 7th word “vrn_xn”
@ Charles:
Ah, found that. In my version they don’t have spaces between them. If you look at the original image, you’ll see that the position is mildly ambiguous because of the “y” letter, and I am relatively sure that there is no space. Then the word on the 4th page becomes vzi/hvrnz>n (/ is the switch), which is vz+first word of text+ z>n (second most common “suffix”, 41 occurrences in text)
Any ideas on how to transform text? Or any tool to help with grammar analysis – not something I can do on my own?
Right now I am getting crazier: I have separated “m” into two symbols – the one with properties of prefix letter and the one that always occurs after code group. 60 code groups, 1 to 3 letters each, 27 z-suffixes. I think I am overthinking things…
There might be also some clue in the letter shapes (the original letters), actually, but no idea what. This is based on the tenuous fact that the switches look like switches on electronics.
@ Satsuoni:
I’ll agree with you on both of those. The transcript I used incorrectly put spaces there.
I know that everyone is working hard on this puzzle, and I wish you luck.
Seeing as I however cannot contribute to the process, I will just sit here and say that that is one of the prettiest drawings I have seen for quite a while. It seriously looks like it could actually be a part of a real-life manuscript. Bravo to the artist!
@ Phlosioneer:
BTW, if this thing IS based on German after all, as a non-native (non-)speaker I feel obliged to point out the existence of verbs with separable prefixes in the language – a bit similar to words like “take off” in English in that they’re two “words” forming a meaning together, but different and possibly trickier in that they can be both separated (in finite forms) and not (infinitive).
Most of this decoding talk goes way over my head, so I doubt I have any chance, but I can try and supply little ideas of philological kind…
Your analysis of the missing double characters is interesting from a philological point of view, because this is surely a striking feature in English or German, but perfectly natural in Czech (my native tongue), where double characters are very rare. (In the case of Czech, length is substituted with diacritics for vowels, but that’s clearly not the case here.) Given the small number of characters discussed by others, if there’s any sort of letter-substitution going on here at all, I think hand-picked one-on-one substitutions are not very likely.
I must end my ramblings here, because it is very, very late and I should turn off the computer.
I think I found an ampersand or some kind of short form; This character appears only individually and sporadically. It makes sense to me that it would be an “and” symbol, which is common in European languages, but it could be something more exotic or a shorthand for something.
http://s1285.photobucket.com/user/LanvalLinguistics/media/Ampersan_zpse91b64d6.png.html
Also the repetition here seems critical; http://i1285.photobucket.com/albums/a585/LanvalLinguistics/Repetition_zps663c051a.png.html
Considering the subject matter, I suspect the first word in each is “Coon”/”Racoon” [with the “oo” being a single letter] but it could be birds or an important religious figure. Regardless, it may have to do with the picture which seems to show their escape into the wilderness away from captivity.
I suspect as a whole, it’s initially a Promethean tale of forbidden knowledge shared with the Racoons by their God [Edenic Fruit Symbolism], Captivity/Escape and then Exile [hence the ruined photo of the garden]. It’s probably a story of how they learnt to climb and/or pick locks.
My links did not complete;
“The Tironian note”/Possible Ampersand:
http://s1285.photobucket.com/user/LanvalLinguistics/media/Ampersan_zpse91b64d6.png.html
The Verse/Repetitive Line;
http://s1285.photobucket.com/user/LanvalLinguistics/media/Repetition_zps663c051a.png.html
A second point here is that it appears to be Noun Initial, rather than Verb-Initial as I think I saw someone imply. It is probably S-V-O, though I guess O-V-S is possible.
The lack of short initial verb implies one of three things;
1. The Author is using Present or Perfect Tense English as opposed to Past [Unlikely in my Books]
2. The Author has developed a coded Tense system/short hand where Auxiliary verbs are eliminated [Possible but that would make it very, very hard. Can’t rule it out, at least]
3. The Author is using German [which to my limited knowledge keeps a single word for past tense]. I think that to a german speaker, the above passage may give some critical clues.
Should these links not post again; I am referring to the four sentences under the drawing of the Racoon split between a skeleton in a cage and living in the forest. And the Tironian Note/Ampersand Symbol is two dots under what could pass as an ornate P.
@ Satsuoni:
Stupid enter key.
Thanks for the snippet of reduced text.
I was playing around with an idea basically using 13 of the letters in your reduced version and using the last two as switches between a “high” range and “low” range alphabet. The idea works sorta well I think but I probably used the wrong letters to switch.
The resulting text is totally gibberish, but even if Novil used every other letter in this method or broke it up like i did, with a-m being the “low” range and n-z being the “high” range, we can always use one of those online Caesar shift tools to figure out the correct letters later. I was using z and m to switch between the alphabets but I think that may be a little flawed.
Link to a Caesar solver. I usually try different phrases to make sure i didn’t get one with names in it. http://rumkin.com/tools/cipher/cryptogram-solver.php
Below I’m going to post an example of what I tried maybe someone else will have an “aha” moment and get the right letters.
Low Range
= v n / b m # r s i h > t c z
a b c d e f g h i j k l m
high range
= v n / b m # r s i h > t c z
n o p q r s t u v w x y z
original
cvm>/ c=h= cvm>/ c=h=zs= rvcvzr= /m= svrnzrn rb#nm vz#vmz>n s/rnzvm >=#=r=
new version
zokd maja mbxq znwnha tozoga dn uotoga gefc ofbxp (incomplete and messed with)
I made this assumption on the chart you showed us earlier.
As I’m writing this I just had another moment of possibility.
The letter groups are split into two major sections as you can see. One section is 8 letters long the other is 5. So you could even split up the alphabet into 4 groups. Two 8 letter groups following one of the two other letters m or z. And the other being two groups of 5 letters following the other letter probably m since it has the least number of pairs associated with it.
Thoughts?
@ Jamie:
Hmm the comments mangled my letter ranges so i’m posting it again without the switches z and m
Original Alphabet
= v n / b m # r s i h > t c z
Low Range ( might be switched of z or m or both?!)
= v n / b # r s i h > t c
a b c d e f g h i j k l m
high range
= v n / b # r s i h > t c
n o p q r s t u v w x y z
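The range-switch idea can be tried out mechanically. The sketch below is only one possible reading: the comment does not say which switch selects which range, whether decoding starts in the low range, or whether the state carries over between words, so all three choices here ("m" selects low, "z" selects high, start in low, state persists) are guesses, and the names are made up:

```python
SYMBOLS = "=vn/b#rsih>tc"      # the 13 non-switch symbols, in the order listed
LOW = "abcdefghijklm"
HIGH = "nopqrstuvwxyz"


def decode_ranges(text, to_low="m", to_high="z"):
    """Decode with two 13-letter ranges selected by switch symbols."""
    table = LOW                # assumed starting range
    out = []
    for ch in text:
        if ch == to_low:
            table = LOW
        elif ch == to_high:
            table = HIGH
        elif ch in SYMBOLS:
            out.append(table[SYMBOLS.index(ch)])
        else:
            out.append(ch)     # spaces and punctuation pass through
    return "".join(out)


if __name__ == "__main__":
    print(decode_ranges("c=h=zs="))
```

Swapping the `to_low` and `to_high` arguments, or the order of `SYMBOLS`, gives the other variants of the idea without touching the rest of the code.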
Lanval wrote:
This does not make sense to me. It might be because I missed out on something further up in the comments; but in what way does simple Past Tense not form a S-V-O sentence?
“I wrote a comment.”
You may be mixing that up with questions-forming.
@ Marmota:
It looks like a conclusion based on the number of letters per word. Which is completely unknown right now.
Another day, still no results 🙁 Nevil, did that progress you mentioned come to fruition yet?
Anyway, tried to see if this cipher is like the cipher where each letter is broken in two, and the adjacent halves are merged. Doesn’t seem to be.
I also doubt it is a vigenere cipher, at least not on the symbol level. Perhaps if the symbols were converted to letters somehow…
There is curiously little info on the internet about the ciphers that use different alphabets (as in, alphabets with a different number of letters), except numerical ones.
I looked into m being a switch symbol. It is possible, but not very likely. It certainly does something, though.
As far as I can tell from the distribution of unique symbol pairs, part of the 4th page, at least, uses a different language. That is not robust, though 🙂
Sigh… I should really stop. Anyway, this is my whole current version of reduced transcription, m is split into two symbols, “m” and “~”. Maybe somebody could use it.
i/hvrn svrnzrn ~nsn vz~n: i/hvrn svrnzrn c/#n >/#=z#/m ivhn. h=mzr= svrn ibrnzr= /~=zrnzt/ v >nsvt=. cvm>/ ~=m>= h=rb in#v >n~nzr= /~=zrn />vmzr/mzrn #/m ivhnzrn #bcv vz#/m ivhn. #/m ivhnzrn c/#v vz~n: /z=r=i= vzs= s=s=zinm. /~= svrnzrn >=~= vz~nsn “sn c=h=” c=h=zt=m i=s= =rv i=s=zrn c=h=zc=mzs= >n ib~/z>n >n~n =r=. cvm>/ c=h= cvm>/ c=h=zs= rvcvzr= /~= svrnzrn rb#nm vz#vmz>n s/rnzvm >=#=r=. c=h= >/~=zvm i/~= >/~=zr= i/~= /~=zr=z#/m ivhn n#vzrn s/rn =r= vz/~= svrn. cvm>/ ib~/zr= /~= r=>v. cvm>/ >nsvt=zr= ~/#= ib~/zrn >=~= vz~=inm c=h= /~=zr/m c=mzs= >nsvt=zr/m c=h=zs= ib~/.
>n~nz>n c=h= ibrnzr= /~=zrn #=s= c=h=zs= >n~n. s= >n~nzrn #=in ibrn r/mzs=z>n #=in ibrnz>n #brv r=i/z>n cvm>/ =rv.
i/hvrn svrnzrn c/#n >/#=z#/m ivhnzrn s=s= svrnz>n i/hvrn =rvz>n inm>nmz>n >nsvt=. sn i/hvrnzrn i=s= =r= vzi/hvrn =mcv. ~=inmz~nzr= sn i/hvrnzrn i=s=: svrn ibrnzrn >nrnm rb#= vzi/hvrn =rv #v>v#vm rb#= ivhnz>n /~=z>n svrn ibrn. c=mz~nzr= h=h= ivhnzrn c=h=zr/m rb#=z>nzsn sbcvzrn incvrvm vz~=inmz>n s=z>n #=in ibrn. ~=inm ivhnz>n svrn ibrnzrnzr/m r=h=zsn sbcv. i/hvrn =rnz>n >nsvt=zrn hnrv c=h= i=s= svrn ibrn. i/hvrnzrn #bcv vz~nsnz>n /~= ivhnzr= i/hvrnz>n inm>nm =rvz>n >nsvt=zrn =r=i= vzs=zinm.
c=h= >/~=zvm i/~= >/~=zr= i/hvrnz>n inm>nmz>n >nsvt=zrn =m>= vz>=#=r=. i/hvrnz>n >nsvt=zrn #v>v#vm sbcv vzrb#=zrn #v>v#vm rnrn vzh=h= =mcv.
cvm>/ in#vz>n ~=m>= h=rbzr= i/hvrnz~nzrn sbinzr/mzs= >n #=in ibrnz>n r=i/z>n cvm>/ =rv. =r=zrn =mcv. c=i/ cvm>/ in#v ib~/ r=zt=m i/~=zh=mzrn #=s= c=h= inm>nmzvm inm>nmzr/mzs= >nzs= ~v~=z>n nmi= ibrn.
t=mzrn =m>= vzh=m vz=rv. =rvzr=zrn r/m >/#= =mcv.
t=m i/~=zrn c=h=zs=zsn. snzrn c=h=zs=z>n t=m i/~=.
t=m sbcvzrn >nrnm n#v c=h=zsn.
t=m sbcvzrn >nrnm /rnm c=h=zsn. t=m rnrnzrn >nrnm vzi/~=zr/m >nzsn i/hvrn. sn i/hvrn sbcvzrn #=s=zt/ vz>nrnm >=#=r= vzs/rnzr/m c/s/ ibrn s/mibc=. i/hvrn sbcv =mcvzrn #=s=zt/ vz>/~=z#nm r/mzs=z>n #=in ibrnzc=m #nhvmzvm #nhvmz>n #brv =r=z>nzt=m i/~=z~n.
t=m i/~= sbcvzrnzr/m s= =mcv vzi/hvrnz>n >nsvt=. c=i/ cvm>/ ivrnz>nzs= =rvzr= h=i/ r=h= n#vzvm h=i/ s/rn n#vzrn />vm vzc/s/ bc= c=h= >=rn =r= c=h= i/hvrn >=rnz>n ~=m>= rnrn. i/hvrnzrn =m>= vzs=zinm. sn i/hvrnzrn c=h= nmi=zs=z>n #=in ibrn c=h= nmi=zs=z>n #nhvmzvm #nhvmz>n #brv =r=. sn i/hvrnzrn #=s=. i/~= sbcv. s=s= sntvzvm s=s= svrnzrn =m>=. c=i/ cvm>/ #=s=zr= i/hvrn rnrnzrn ivhn! cvm>/ ~nz#nmzr= ~=m>=z>n i/hvrnzsnzrn ibrn! cvm>/ #=s=zr= ~=m>=zrn #=s= =r= rnrn. rb#= h=mzrnzr/m bc= #nhvmz>n nr/ s/rn. rb#=zc=zh=m rnzr/mzs= >n sbinzvm s/rn =r=! cvm>/z>n i/hvrnz>n >nsvt=zrnzinm!
@ Satsuoni:
Judging by the amount of deviation, I’m under the impression there’s AT LEAST one more level of encryption before any substitution can be made. Since you figured out the first layer, maybe you can figure out the second? 😀
Nice. I paired the letters based on the digraph frequencies and got the same result as your word analysis (just at the pairs of “s#ci”, I was not sure).
Jamie proposed a Caesar cipher: that cannot work even if we identify letters of the source text, as the order of our alphabet is arbitrary. We have far more than 26 ways to assign them (26! = 26*25*…*1).
Problematic things:
The mentioned w which should be a z – I treat it as z.
We have two “o” ending a word in the transcription. In both cases, the next word begins with a “y”, and I think the space between the letters could be a regular spacing within a word. To fit the pattern, I removed the space in both cases.
We have one “o”, followed by a “p”. I can confirm the transcription, but that is the only deviation from the nice pattern, so it might be an error in the comic (similar to the one w). I ignored that issue.
A “w” ends the next to last word, it is the only occurrence of that. I have no idea what to do with that.
Further analysis:
The switch symbols show the same digraph pattern as =nvb. I think that it is a regular letter, and switching between the two alphabets itself does not mean anything (we find the same words in both alphabets).
This leaves 3 groups: One with m/z (special characters), one group of 5 symbols, and one group of 8 symbols. To improve readability, I replaced the 5 symbols with vowels, m with x and z with y and the other 8 symbols with consonants from b to p. This produces text like
“lamu lixdu mibeydeyga abiyba famu bafa ekiyix famu”
I uploaded the result: http://en.pastebin.ca/2428118 (I had the option to encrypt that… :D)
Translation table:
m z = n v b r > h s # c i t e u
x y a e i o b d f g k l m p u u
Way easier to read than the original transcription, and I think it still has the full information content. All following letters refer to that modified text.
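The translation table above can be applied directly with a character map. A minimal sketch, assuming the input is Satsuoni's reduced transcription, where the switch is written as "/" (mapping "/" to "u" is my assumption; the table itself only lists e and u). Symbols outside the table, such as the "~" from the split-m version, are left as they are:

```python
SRC = "mz=nvbr>hs#cit"
DST = "xyaeiobdfgklmp"
# e and u are the two switch symbols; "/" is the switch in the reduced text.
READABLE = str.maketrans(SRC + "eu/", DST + "uuu")


def to_readable(text):
    """Rewrite reduced symbols as pronounceable vowels and consonants."""
    return text.translate(READABLE)


if __name__ == "__main__":
    print(to_readable("c=i/ cvm>/ ivrnz>nzs= =rvzr= h=i/ r=h= n#vzvm h=i/"))
```

On that input, a phrase from the reduced transcription posted earlier, the output is the sample string quoted above ("lamu lixdu mibeydeyga ..."), which suggests the two versions line up.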
As building blocks, I get:
– 32 combinations consonant+vowel (6 of them are not used, two are used just once) (where twice the same combination can occur)
– single x/y (where xy and yx are possible)
– single vowels (but not two after each other)
We have “fafa” in the code, and there is no English or German word with two times a single letter and nothing else. Therefore, the 32 combinations are not just single letters or our words are not words.
Well… I made a digraph table of those 33 building blocks.
All building blocks except x/y prefer some other building blocks before/after them. As an example, “ba” occurs 39 times: It begins a word 6 times, it follows a “y” 19 times, “a” 11 times, “ka” 3 times, and nothing else.
“fi” occurs 29 times, 5 times as “kefix” and 24 times as “mufibe” – and nothing else is used in front of or behind “fi”.
Looks like we need a new idea.
@ Satsuoni:
I don’t know whether this will help, or if it’s even relevant, but…
One of the character sets I’ve been working on for a fictional universe is built around the idea of a base-16 number system, with phonetic “letters” that are written as two-digit numbers, a pattern that toggles things between “letters” mode and “numbers” mode, and an actual zero expressed as a double of the zero character because the zero is the “escape” character that’s the first part of both the toggle and of all mathematical operators. Punctuation, accents, and other symbols, including spaces, are also expressible as two-digit numbers.
I don’t know if what we’re seeing here is anything remotely like that, but it wouldn’t surprise me. And no, my character set looks nothing like Novil’s, nor to my knowledge has he seen it, nor have I seen anything related to the encoding of his.
I will also suggest that if two-character “letters” are in fact the case, some of the combined symbols might represent sounds that in English or German are expressed with two characters as well, even if most of them represent sounds that in English or German are expressed with only a single character.
Alternatively, we could be looking at a language like Hawaiian which has a much smaller character set than Latin. Or an “abjad” like Hebrew or Arabic where vowels are either modifiers to letters which are consonants only, or are omitted entirely.
I just felt the urge to say “challenge accepted”. I wonder why…
@ Marmota:
I think I might have misread someone as implying that Novil shuffled the order of words in the language in order to make it trickier, writing out Verbs First, then the rest of the sentence. I think the repetition just implies a sentence order common to English and German that the Subject is initial. I was likely overthinking the cipher there.
Well, the work week has started and I have little to no time to spend on the cracking 🙁 And I still need to at least read my thesis before presentation <_< It has been a while. So, no real progress to speak of.
However, I have received a letter from Nivel confirming my findings, above. (aka the double alphabet , switch retainment thing and character mapping). So there is that.
@DanialArin: As I mentioned above, I have investigated possibility of recoding the text into equal-length codewords. Doesn't seem to work. The text does have syllabic structure not dissimilar to engrish (but not similar, either. Just that you can fit the groups into the kana alphabet neatly) , but that is it. Mind you, I am by no stretch of imagination a professional cryptanalyst, so I may be wrong.
Well, maybe later I'll list the cipher types that can be. No time now.
Alright, I’m now 99% sure there’s a typo in the second-to-last paragraph, 5th line, 2nd word after the period.
If it’s not a typo, everything breaks down 🙁
@ Charles:
Since Novil (sorry for typo above) confirmed that my mappings are correct, that means that that single “w” is indeed “z”