asciilifeform: but i suspect this aint what these had in mind
asciilifeform: at one pt i experimented with, among other lulz, playing suspect-rng via headphone , to find regularities
asciilifeform: does it involve ear-wilting whine in headphone ?
asciilifeform: ( was mega-fad in msdos days -- proggy that prompts 'which dope', you select one, and then listen to 'hypnotic' squeals from headphone )
asciilifeform: '... "hypnosis game" that "makes you" transsexual ?' << recall the 1990s idjits with 'audio drug' ?
asciilifeform: ( would still run on a 1cpu box, with slight penalty, tho, cuz unix scheduler )
asciilifeform: ( typically '101% cpu' when at full throttle )
asciilifeform: btw thing already sometimes spills into 2 cpu (there are slave threads for uart and clock)
asciilifeform: right, but how wouldja parallelize w/out tying up >1 pc cpu
asciilifeform: http://btcbase.org/log/2019-07-25#1924597 << closest thing to this would be where you actually emulate the traditional mips pipeline and have a thread per stage (e.g. inst decode, fetch, execute ) but this would be, if you think about it , ~even worse~ than current, tie up 5 cpu cores to emulate 1 little mips...
asciilifeform recognizes that this won't make a lick of sense to anyone who hasn't read the thing. putting here mainly for log.
asciilifeform: ( short spoiler : if you ensure that neither tlb nor 'asid' has changed since last tlb lookup, and current vaddr's tag is same as during last lookup, can use same pfn (i.e. tlb lookup result) as then. )
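The shortcut in the spoiler can be sketched roughly as below. This is an illustration in Python, not the actual 'M' code (which is amd64 asm): the class name, the `tlb_generation` counter, the `slow_lookup` callable and the 4 KiB page size are all assumptions made for the example.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages

class CachedTranslator:
    """Remembers the last TLB result and reuses it while nothing relevant changed."""

    def __init__(self, slow_lookup):
        self.slow_lookup = slow_lookup  # hypothetical: callable(asid, tag) -> pfn or None
        self.tlb_generation = 0         # hypothetical: bumped on every TLB write
        self.asid = 0
        self._last = None               # (generation, asid, tag, pfn) of the last hit
        self.slow_calls = 0             # counts full lookups, for demonstration only

    def write_tlb(self):
        # Any change to the TLB invalidates the remembered result.
        self.tlb_generation += 1

    def translate(self, vaddr):
        tag = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        key = (self.tlb_generation, self.asid, tag)
        if self._last is not None and self._last[:3] == key:
            pfn = self._last[3]         # same TLB, same asid, same tag: reuse the pfn
        else:
            self.slow_calls += 1
            pfn = self.slow_lookup(self.asid, tag)
            if pfn is None:
                return None             # miss; a real machine would raise an exception
            self._last = key + (pfn,)
        return (pfn << PAGE_SHIFT) | offset

table = {(7, 0x12345): 0xABC}
t = CachedTranslator(lambda asid, tag: table.get((asid, tag)))
t.asid = 7
first = t.translate(0x12345678)   # full lookup
second = t.translate(0x12345FFC)  # same page: served from the remembered result
calls_before_write = t.slow_calls
t.write_tlb()
third = t.translate(0x12345000)   # TLB changed, so the full lookup runs again
```

The gain comes from consecutive accesses landing on the same page, which lets most translations skip the full lookup.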
asciilifeform meanwhile found method to speed up 'M' by ~30% (possibly moar), will vpatch.
asciilifeform: saw ( tho can't recall from what was linked )
asciilifeform: neato, will have to eat (when... 'after the war')
asciilifeform: mp_en_viaje: how about old man de sade ?
asciilifeform: mp_en_viaje: x86 is a mountain of sad of which most folx (even such as write proggies) see only the very tip. but goes all the way down into magma.
asciilifeform: relatedly, i can't fathom why bellard et al decided to simulate gnarly existing physical irons , rather than patching kernel
asciilifeform: BingoBoingo: it aint as if there were no support for mips in the classic kernel tho. ( i only had to add 100ln or so specifically to bring up the simplified sim-devices )
asciilifeform: mp_en_viaje: thinking about it moar, i suspect that it is possible to bake a less screamingly retarded version of bellardism , where, e.g., the 'cache' is prefilled with unconditional jumps to the sim-instructions, eliminating the decode hit
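One possible reading of the 'prefilled cache' idea, as a toy sketch: resolve every instruction to its handler once at load time, so the run loop only follows ready-made slots and never decodes. The two-instruction toy ISA, the handler names and the use of `functools.partial` in place of emitted jump instructions are assumptions for illustration only.

```python
from functools import partial

def op_addi(rd, rs, imm, regs):
    regs[rd] = (regs[rs] + imm) & 0xFFFFFFFF

def op_andi(rd, rs, imm, regs):
    regs[rd] = regs[rs] & imm

HANDLERS = {"addi": op_addi, "andi": op_andi}  # hypothetical toy ISA

def prefill(program):
    # Decode happens exactly once, here; each slot ends up pointing straight at
    # its handler with operands already bound.
    return [partial(HANDLERS[op], rd, rs, imm) for (op, rd, rs, imm) in program]

def run(table, regs):
    for slot in table:
        slot(regs)  # no decode in the hot loop

regs = [0, 5, 0, 0]
table = prefill([("addi", 1, 1, 3), ("andi", 2, 1, 0xC)])
run(table, regs)
```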
asciilifeform: all of this is 1) exceedingly gnarly , from 'fits in head' pov 2) requires a writable-and-executable memory segment. i.e. massive open wound.
asciilifeform: and as it goes, tries to keep cache full, does equiv. of 'prefetch' (if this were a normal iron cache), etc
asciilifeform: they used 'dynamic recompile' method, where the sim simulates a kind of cache. when sim-instruction fetched, it looks whether found in this cache, and if not, emits a chunk of x86olade into said segment of cache, which corresponds to 'compiled' ver. of that instr.
asciilifeform: for thread-completeness, i'ma summarize how bellard et al did it
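The 'dynamic recompile' scheme summarized in this thread can be sketched in miniature as follows. Python closures stand in for the emitted x86 chunks and a dict stands in for the writable-and-executable segment; the toy ISA and all names are assumptions, and nothing here is taken from bellard's actual code.

```python
def compile_instr(instr):
    """Stand-in for emitting native code: returns a closure for one instruction."""
    op, rd, rs, imm = instr
    if op == "addi":
        def run(regs):
            regs[rd] = (regs[rs] + imm) & 0xFFFFFFFF
    elif op == "andi":
        def run(regs):
            regs[rd] = regs[rs] & imm
    else:
        raise ValueError("unknown op: " + op)
    return run

class Recompiler:
    def __init__(self, program):
        self.program = program
        self.cache = {}    # pc -> compiled chunk
        self.compiles = 0  # how many times the 'compiler' ran

    def step(self, pc, regs):
        code = self.cache.get(pc)
        if code is None:   # not in the cache: compile now and remember the result
            code = compile_instr(self.program[pc])
            self.cache[pc] = code
            self.compiles += 1
        code(regs)
        return pc + 1

sim = Recompiler([("addi", 1, 1, 3), ("andi", 2, 1, 0xF)])
regs = [0, 0, 0, 0]
for _ in range(3):         # run the two-instruction program three times over
    pc = 0
    while pc < len(sim.program):
        pc = sim.step(pc, regs)
```

Each instruction is compiled on first fetch only; later passes hit the cache.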
asciilifeform: in the actual instrs. per se (see 'mipsinst' dir.) , used conditional mov's wherever could think of how, so as to also keep pipeline full
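For readers unfamiliar with the point of a conditional mov: it picks one of two values without a branch, so the pipeline is never flushed on a mispredict. The mask arithmetic below illustrates the same branch-free selection in Python; it is only an analogy under assumed 64-bit operands, not how 'M' is written.

```python
MASK64 = (1 << 64) - 1

def select(cond, a, b):
    """Return a if cond else b, using only arithmetic (no branch on cond)."""
    m = -int(bool(cond)) & MASK64  # all ones when cond is true, zero otherwise
    return (a & m) | (b & ~m & MASK64)

picked_true = select(True, 0x11, 0x22)
picked_false = select(False, 0x11, 0x22)
```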
asciilifeform: seems like there is only so much 'silk purse' that can be made from this particular sow's ear.
asciilifeform: best i could think of , was to arrange the mov.../and.../shr... etc so as to occupy the amd64 pipeline properly, in each inst. form.
asciilifeform: ( at least not on my opteron, on 2014+ theoretically could, but i dun buy those )
asciilifeform: in the beginning, thought i could at least implement the mips instruction operand decode using xmm. but ha! not that either.
asciilifeform: (i.e. cannot add/sub/mul or even shift entire xmm reg)
asciilifeform: to add insult to injury, asciilifeform knows how to represent 'x bit lookup from x*k-bit string representing table' via arithmetic methods -- but! amd64 dun let you do full arithmetic on xmm either
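One plausible reading of 'x bit lookup from x*k-bit string', sketched with Python's arbitrary-width integers: pack k entries of x bits each into one wide integer, then fetch entry i with a shift and a mask. Whether this is the arithmetic method meant above is an assumption; the function names are made up for the example.

```python
def pack_table(values, x):
    """Pack k values of x bits each into a single x*k-bit integer."""
    table = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << x):
            raise ValueError("value does not fit in x bits")
        table |= v << (i * x)
    return table

def lookup(table, x, i):
    """Fetch the i-th x-bit entry using only shift and mask."""
    return (table >> (i * x)) & ((1 << x) - 1)

packed = pack_table([3, 0, 2, 1], 2)  # four 2-bit entries in one 8-bit string
entries = [lookup(packed, 2, i) for i in range(4)]
```

This needs a shift across the full width of the packed value, which is exactly the kind of whole-register operation complained about here.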
asciilifeform: ( for n00bz / tlbism (aka mmu) is how os gives individual userland process the illusion of 'infinite memory, starting from addr 0', paging, etc )
asciilifeform: kernel-mode coad btw bypasses the tlb, so runs with the expected ~40x slowdown.
asciilifeform: mp_en_viaje: well, on my opteron, they're 128bit regs (on moar recent -- 256 and even 512!) but only the lower 64 is directly addressable. and gives buncha 'simd' ops, e.g. 'shuffles' where 'take erry 7th bit and add'em' etc. but looked and looked an' found nothing that'd correspond to table lookup
asciilifeform: if bellard were here, would prolly cackle.
asciilifeform: world's most retarded machine arch, the pc is.
asciilifeform: originally was at least gonna park the tlb ~in~ the xmm regs. but guesswat, there aint even a way, in 1 clock cycle, to get anything other than low 64bit out of'em !
asciilifeform: http://btcbase.org/log/2019-07-24#1924463 << on the actual iron, it happens in 1 clock cycle, cuz entirely parallel. in 'm', ends up iterating through all 16 tlb entries. i looked into using 'simd' instructions to do it, but they are ~artfully~ useless , ended up using the xmm regs strictly as fast temp storage;
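A minimal sketch of the sequential scan described here, where software checks the 16 entries one at a time while hardware would compare them all at once. The entry layout is a simplified assumption (real MIPS entries also carry a page mask, a global bit and paired even/odd pages), and the 4 KiB page size is likewise assumed.

```python
from typing import NamedTuple

PAGE_SHIFT = 12  # assumed 4 KiB pages

class TlbEntry(NamedTuple):  # hypothetical, simplified entry
    tag: int
    asid: int
    pfn: int
    valid: bool

def tlb_scan(tlb, asid, vaddr):
    """Walk every entry in turn; return the pfn on a match, None on a miss."""
    tag = vaddr >> PAGE_SHIFT
    for entry in tlb:
        if entry.valid and entry.asid == asid and entry.tag == tag:
            return entry.pfn
    return None  # a real machine would raise a TLB refill exception here

tlb = [TlbEntry(0, 0, 0, False)] * 16
tlb[5] = TlbEntry(tag=0x12345, asid=7, pfn=0xABC, valid=True)
hit = tlb_scan(tlb, 7, 0x12345678)
miss = tlb_scan(tlb, 8, 0x12345678)  # same address, different asid
```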
asciilifeform: grr phf is it gonna take 2 months for the thing to be in btcbase.org/patches so i can link to it ?!
asciilifeform: the way i was gonna implement the nic, is to pipe packets via linux's 'tap/tun' into a slot in the 'm' mmio space (see bus.asm) and generate slave irq (see irq.asm) . if these arrive faster than the irq can be eaten, they simply get dropped.
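The drop-when-busy behaviour of the planned nic can be sketched as a one-packet mailbox. This models only the policy; the tap/tun plumbing, the mmio layout in bus.asm and the irq mechanics in irq.asm are not represented, and every name below is hypothetical.

```python
class NicSlot:
    """Single-packet slot: the host fills it, the guest empties it on irq."""

    def __init__(self):
        self.packet = None
        self.irq_pending = False
        self.dropped = 0

    def host_deliver(self, packet):
        # If the previous irq has not been eaten yet, the new packet is dropped.
        if self.irq_pending:
            self.dropped += 1
            return False
        self.packet = packet
        self.irq_pending = True
        return True

    def guest_service_irq(self):
        # The guest reads the slot and clears the irq, freeing it for the next packet.
        packet = self.packet
        self.packet = None
        self.irq_pending = False
        return packet

nic = NicSlot()
accepted_a = nic.host_deliver(b"a")
accepted_b = nic.host_deliver(b"b")  # arrives before the guest has serviced 'a'
received = nic.guest_service_irq()
accepted_c = nic.host_deliver(b"c")
```

With no queue there is nothing to overflow; the cost is that bursts lose packets and the protocols above have to retransmit.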
asciilifeform might even be wrong, and it is conceivable that 'dhrystone of pentium166 but i/o of opteron' actually suffices to e.g. host blogs.
asciilifeform: ( for that matter -- if e.g. mp_en_viaje one day bakes actual silicon cpu , would run on par with pc, with unchanged soft. )
asciilifeform: to round off that thread -- asciilifeform strongly suspects that kernel would run 100-200x faster on the 10 $ 'ice40' than on opteron-cum-'M'
asciilifeform: conceivably there is a use for e.g. cuntoo that loads from memory snapshot in <3sec but runs 'like pentium 166'
asciilifeform: mod6: you can safely skip it, it's a ~null result
asciilifeform: mp_en_viaje, i predict, will cackle, say, 'toldja, dummkopf'
asciilifeform: mod6: i was gonna genesis a kernel for that thing, but nao after sad benchmark thinkin', it's of ~0 use outside of very specialized applications
asciilifeform: ^ figure on properly unloaded box, vs. earlier; ~= 252.3x slower than host on same benchm.
asciilifeform: 'Dhrystones per Second: 28112.8' on same.
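A quick arithmetic check on the two quoted figures, assuming the 252.3x slowdown refers to this Dhrystone result. The 1757 divisor is the conventional VAX 11/780 reference for 1 DMIPS; the implied host figure is derived here, not measured.

```python
VAX_11_780_DHRYSTONES_PER_SEC = 1757  # conventional 1-DMIPS reference

sim_dhrystones_per_sec = 28112.8      # figure quoted in the log
slowdown_vs_host = 252.3              # figure quoted in the log

# Implied host score, if the slowdown was measured on the same benchmark.
host_estimate = sim_dhrystones_per_sec * slowdown_vs_host

# The simulated machine's score in DMIPS.
sim_dmips = sim_dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC

print(f"implied host: {host_estimate:,.0f} dhrystones/sec")
print(f"sim: {sim_dmips:.2f} DMIPS")
```

That works out to roughly 7.09 million dhrystones/sec for the host and about 16 DMIPS for the simulated machine.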
asciilifeform: ( if anyone finds -- plox to write in. )
asciilifeform: to be pedantically exact : 66.35 'bogomips' (on 3GHz opteron) , and published table gives e.g. 'Pentium/166 66.36'. i was unable to find similar table of classics for 'dhrystone' .