erlehmann: i later learned someone corrected all that stuff in the file sent to the printer by hand. their loss.
erlehmann: i think we submitted more changes than any other book by one or two orders of magnitude, just because we had it automated.
erlehmann: turns out a bit of ghostscript can fix that problem
erlehmann: but then drafts had to be sent as ODT and came back as PDF to be annotated. in hindsight that was probably a measure to prevent too many annotations.
erlehmann: we developed book using git and RST (because almost plaintext), with reviewers who also used git.
erlehmann: but i accidentally made someone hate me i guess
erlehmann: german o'reilly was different. worse, if you ask me.
erlehmann: in a chapter about evolution of traditional smilies to emoji
erlehmann: there are squares where cat faces should be
erlehmann: and they actually failed at astral plane unicode characters
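A minimal sketch of what "astral plane" means in practice, assuming Python 3 and using a cat face emoji (U+1F63A) as the sample. The helper names `is_astral` and `utf16_units` are made up for illustration. Any code point above U+FFFF needs a surrogate pair in UTF-16 and four bytes in UTF-8, and that is where pipelines built around 16-bit characters tend to break.

```python
# Hypothetical helpers, for illustration only.
cat = "\U0001F63A"  # a cat face emoji outside the Basic Multilingual Plane


def is_astral(ch: str) -> bool:
    """True if the single character lies above U+FFFF."""
    return ord(ch) > 0xFFFF


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to store ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


print(is_astral(cat), utf16_units(cat), len(cat.encode("utf-8")))
```

Software that assumes one 16-bit unit per character sees two "characters" here, which may be one way to end up with squares in print.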
erlehmann: asciilifeform you long for the days of codepage switching, i guess?
erlehmann: koi8-r was the one where bitfuckery arranged the cyrillic glyphs so that taking away the high bit left a readable ASCII transliteration
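The KOI8-R trick can be tried out with Python's built-in `koi8_r` codec. This is a small sketch; the function name `strip_high_bit` is an assumption made for this example.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear bit 7 of every byte, decode as ASCII."""
    koi8_bytes = text.encode("koi8_r")
    # Clearing the top bit maps each cyrillic letter onto a latin one.
    # Letter case comes out inverted: lowercase cyrillic -> uppercase latin.
    return bytes(b & 0x7F for b in koi8_bytes).decode("ascii")


print(strip_high_bit("привет"))  # PRIWET
```

The result is a rough transliteration, not a translation, but it stays legible on a 7-bit channel.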
erlehmann: i guess they „try“, as in keeping up appearances
erlehmann: i wrote „tries“ because authors are charlatans
erlehmann: want to write garbage? guess you can't
erlehmann: it makes it immediately clear what is part of the grammar and what not
erlehmann: one of the few things about urbit i like is that the system tries to not allow users to input statements not syntactically valid.
erlehmann: intersection of statements: no one is right. everyone is wrong.
erlehmann: mircea_popescu that is actually a major point from me. i tend to drink only with people who either proved they can behave appropriately or it would not bother me if they fucked me in the ass with a supersized dragon dildo while i'm too drunk to resist.
erlehmann: almost half of the code of epigraph is a recognizer for the input
erlehmann: mircea_popescu when i write software that considers the input language, i tend to write a grammar for the subset i am going to handle. so yes, i have written software that considers e.g. only characters valid in urlencoded base64 as input and rejects everything else.
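A sketch of such a recognizer, not the code of epigraph. It assumes that "urlencoded base64" means the URL-safe base64 alphabet of RFC 4648 section 5 (`A-Z`, `a-z`, `0-9`, `-`, `_`) with up to two `=` of trailing padding. The name `recognize` is hypothetical.

```python
import re

# Assumption: URL-safe base64 alphabet, optional trailing "=" padding.
_URLSAFE_B64 = re.compile(r"[A-Za-z0-9_-]*={0,2}")


def recognize(text: str) -> str:
    """Return text unchanged if the whole input matches, else raise ValueError."""
    if _URLSAFE_B64.fullmatch(text) is None:
        raise ValueError("input rejected: outside the accepted language")
    return text


print(recognize("aGVsbG8_"))
```

Using `fullmatch` means the whole input has to be in the language. Anything unexpected fails at the input frontier and never reaches later code.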
erlehmann: my “those people are almost certainly charlatans” moment with urbit was when their software barfed on a character being more than one byte, despite claiming unicode support.
erlehmann: lots of software is broken beyond repair considering characters
erlehmann: at least that is for software i write. full recognition etc. pp.
erlehmann: about characters, if you start using shit i do not expect, prepare to fail at the input frontier.
erlehmann: mircea_popescu if i was not entirely sure she would never have gotten there.
erlehmann: let me say it like this: a friend of mine once randomly started to masturbate beside me while visiting. i did not care as i was occupied otherwise, since she did not leave stains. if she had, i would have told her to clean it up.
erlehmann: mircea_popescu i negotiate such things beforehand. wants to live with me? must live up to my standards.
erlehmann: having delved into plan9 innards, i can say that some ill-thought abstractions make obvious shittery obvious, that is all.
erlehmann: mircea_popescu i tend to fuck people i can stand enough to live with, so … yes? unless i misunderstood that.
erlehmann: while codepages i use are gibberish to CJK
erlehmann: i tend to not waste brain cycles on codepages i do not use
erlehmann: asciilifeform expensive in what terms?
erlehmann: > estimates imply that every death of helmetless imbecile prevents or delays as many as 0.33 deaths among imbeciles on transplant waiting list
erlehmann: trinque if ${girl?} wanted to make a joke, she should know: APL is not the worst offender, merely the first. rules for identifiers in c# correspond to unicode standard annex 15.
erlehmann: i noticed once logs of another channel were gone
erlehmann: not in the mood for guessing history i'm not familiar with
erlehmann: large scale deployments are always driven by statistics. i once read that many states have no bike helmet laws because lawmakers are convinced this would reduce biking to the point where the additional cardiovascular diseases outweigh the benefit of fewer head injuries.
erlehmann: process that produces them → people die from head injuries?
erlehmann: sounds like what a qualified cognitive-behavioural therapist would say indeed!
erlehmann: maybe an idiom i do not understand
erlehmann: mircea_popescu would you explain regulation of anxiety?
erlehmann: i created unifont glyphs using emacs and sed
erlehmann: asciilifeform what you think of unifont hex file format?
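For readers who have not seen the unifont hex format: each line is `CODEPOINT:BITMAP`, with the bitmap written as hex digits, 16 rows tall and either 8 or 16 pixels wide. Below is a small sketch of a parser and renderer. The function name `parse_hex_line` is an assumption, and the sample line is the glyph for `A` as I recall it from the Unifont distribution.

```python
SAMPLE = "0041:0000000018242442427E424242420000"


def parse_hex_line(line: str):
    """Split one .hex line into (codepoint, list of 16 rows drawn with . and #)."""
    codepoint_hex, bitmap_hex = line.strip().split(":")
    rows = 16
    digits_per_row = len(bitmap_hex) // rows  # 2 for 8 px wide, 4 for 16 px wide
    width = digits_per_row * 4
    picture = []
    for i in range(rows):
        chunk = bitmap_hex[i * digits_per_row:(i + 1) * digits_per_row]
        bits = format(int(chunk, 16), "0{}b".format(width))
        picture.append(bits.replace("0", ".").replace("1", "#"))
    return int(codepoint_hex, 16), picture


codepoint, picture = parse_hex_line(SAMPLE)
print(hex(codepoint))
print("\n".join(picture))
```

Because the format is plain text, one glyph per line, editing it with emacs and sed is practical.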
erlehmann: asciilifeform can you actually see ™ and ® btw?
erlehmann: mircea_popescu it's only war in europe that is made less profitable. lots of wars to shop around for otherwhere.
erlehmann: consider that idiots like hitler, napoleon etc. pp. found war more profitable than peace
erlehmann: i thought singular project is interconnection. make war less profitable than peace.
erlehmann: just pay your taxes and you'll be spared from war or worse!
erlehmann: mircea_popescu empire does not pretend equality, but tolerance it has. EU has 23 official languages AFAIK.
erlehmann: to me that is not tolerance, but annexation
erlehmann: unicode consortium sees group using not-unicode, swallows.
erlehmann: i think it is not about the literature. i think it is to cater to the special interest groups.
erlehmann: asciilifeform nothing lasts as long as a stopgap!
erlehmann: and one introduced because unicode consortium was sick of other misbehaving standards bodies (ISO country codes), ha-ha!
erlehmann: i just see „DE“ surrounded by a special frame instead of a german flag
erlehmann: phf unifont actually leaves out shitstains like arbitrarily combining glyphs to make national flags.
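How the flag mechanism works, as a sketch: a flag is two REGIONAL INDICATOR SYMBOL code points (U+1F1E6 to U+1F1FF), one per letter of the ISO country code. A font without flag ligatures shows the two letters separately. The function name `regional_indicators` is made up for this example.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def regional_indicators(country_code: str) -> str:
    """Map a two-letter ASCII country code to its pair of regional indicators."""
    if len(country_code) != 2 or not country_code.isascii() or not country_code.isalpha():
        raise ValueError("expected two ASCII letters")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in country_code.upper()
    )


pair = regional_indicators("DE")
print([hex(ord(c)) for c in pair])  # ['0x1f1e9', '0x1f1ea']
```

Whether the pair is drawn as a flag or as two framed letters depends on the font.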
erlehmann: it nicely complements what they already have out of the box
erlehmann: <Multi_key> <N> <S> <D> <A> <P> : "卐" U5350 # CJK UNIFIED IDEOGRAPH-5350
erlehmann: i suggest put the following in your .XCompose if you use X11
erlehmann: asciilifeform unicode has you covered
erlehmann: e.g. those who do not use ß / ẞ write “Der grosse Duden” instead of “Der große Duden”
erlehmann: in standard german, ß / ẞ / ss / SS are not ambiguous, but in german variants that do not use ß / ẞ, the ss / SS could be a digraph depending on context.
erlehmann: note that postprocessing is always wrong.
erlehmann: ch does not exist as unicode ligature, but as i understand distinguishing characters is one of the problems unicode is at least intended to solve. as soon as everyone uses Æ for “AE” the postprocessing is no longer needed.
erlehmann: instead of having to postprocess ASCII
erlehmann: well, if ch is one character, i rather have it as one codepoint.
erlehmann: one with only 25 members, seems sensible
erlehmann: character count and whitespace is locale specific? AFAIK whitespace is an ENUM
erlehmann: also, comparing two strings should be done bytewise, always. everything else is madness.
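A short sketch of what bytewise comparison implies, assuming UTF-8 and Python's standard `unicodedata` module. The helper name `bytewise_equal` is hypothetical. Two strings that render identically can still differ byte for byte, because "é" has both a precomposed and a decomposed spelling.

```python
import unicodedata

composed = "\u00e9"     # é as one code point
decomposed = "e\u0301"  # e followed by COMBINING ACUTE ACCENT


def bytewise_equal(a: str, b: str) -> bool:
    """Compare the UTF-8 encodings byte for byte, with no normalization."""
    return a.encode("utf-8") == b.encode("utf-8")


print(bytewise_equal(composed, decomposed))
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Bytewise comparison is simple and predictable. If a system wants these two spellings to be equal, the normalization has to happen once at the input, not inside the comparison.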
erlehmann: jurov converting case is locale-specific, therefore better left undone.
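One illustration of why case conversion is risky, using Python's built-in `str.upper` and `str.lower`. These apply the default Unicode mappings and ignore locale. The sample word is an assumption chosen for the example.

```python
word = "straße"
upper = word.upper()        # default mapping turns ß into SS
round_trip = upper.lower()  # lowering does not bring ß back

print(upper, round_trip, round_trip == word)
```

The round trip changes the string, so any comparison or lookup done after case conversion no longer sees the original input.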
erlehmann: which obviously is not enough (i read something about ~100k characters being enough)
erlehmann: in those days, only ~21k characters were reserved for CJK
erlehmann: version 1 of unicode was designed to keep under 16 bits
erlehmann: asciilifeform ok enjoy your latin1 while it lasts.
erlehmann: chinese, japanese, korean now share unihan, with all the processing complexity and technical debt incurred from that