asciilifeform: http://btcbase.org/log/2019-03-11#1901354 << spyked - i was thinking, 'let's make torrent', then realized that torrent is some (afaik) largely unexplored heathenware, possibly due for a civilized replacement. might be worth expanding on if anyone has free hands.
asciilifeform: mircea_popescu et al : there are no cstrings in ada, unless one explicitly bakes'em in order to throw to c linked liquishit. all arrays carry their bounds with'em.
asciilifeform: ^ this goes for other folx! bring out thy irons.
asciilifeform: we also host owner-operated iron (e.g. dulap is still snsa ; and trinque has some, and mod6 )
asciilifeform: i'm reluctant to do the massive rk thing until we have a semblance of working gnat for arm
asciilifeform: mircea_popescu: possibly you have an iron that wants to go in the crate ?
asciilifeform: BingoBoingo: i'd ~really~ like to avoid the scenario where i go out with a half-empty crate
asciilifeform: BingoBoingo: currently hands full restarting ffa conveyor; however will be ordering irons in next 2 wks, and scheduling flight when the items with least predictable shipment windows are in hand
asciilifeform: i'ma detail, ftr : 'ffacalc' runs 'as fed', i.e. 1 command at a time. but 'peh' , adult version, has support for functions and loops, and therefore requires the 'tape' to exist in memory. so currently i have 'tape can be 1000000 bytes', but this is not acceptable obv. in the long term
asciilifeform: diana_coman: the 'errything on stack' approach has its limits; it is why i wrote the mmap thing (currently stuck in limbo , but i'ma have to revive it and fix, cuz ffa 17 also is hitting against this wall, you can't expect to put 100MB on stack, you gotta mmap it )
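[ed. note: a minimal sketch of the "mmap it instead of the stack" idea, in Python rather than the Ada under discussion. It is not the author's mmap code; `make_buffer` is a hypothetical name, and the 100MB size is taken from the line above purely for illustration.]

```python
import mmap

def make_buffer(size_bytes):
    """Return an anonymous, zero-filled memory mapping of size_bytes bytes.

    Hypothetical helper for illustration: passing -1 as the file number
    asks the OS for a mapping that is not backed by any file.
    """
    return mmap.mmap(-1, size_bytes)

# 100MB: far more than a typical default stack limit allows.
SIZE = 100 * 1024 * 1024
buf = make_buffer(SIZE)

# The mapping is indexed by byte offset, like a bytearray.
buf[0] = 0x2A
buf[SIZE - 1] = 0xFF
first, last, length = buf[0], buf[SIZE - 1], len(buf)

buf.close()  # hand the pages back to the OS
```

The OS typically hands out the pages lazily, so only the two pages actually touched here need to become resident.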
asciilifeform: ( the style of programming that would appear on ' asciilifeform's ideal cpu ' is best illustrated in http://btcbase.org/patches/fg-genesis/tree/fg.v . i.e. all of the independent pieces in fact run in parallel, and in deterministic time, there are no interrupts, no scheduler, etc. )
asciilifeform: http://btcbase.org/log/2019-03-10#1901140 << the down side of 'let's 512b bus' is that most cpu time (where it runs, not counting idles on i/o here) is spent in 'inner loops' where yer counting to e.g. 3. and nao you gotta move 512bits when yer counting to... 3
asciilifeform: http://btcbase.org/log/2019-03-10#1901153 << bake 'pile of reconnectable flipflop' and then you aint gotta ever bake anyffin else again. iirc i detailed this in ancient thread, mircea_popescu barfed ( iirc answered 'why waste so many transistor on interconnects' ) , but can't currently dig up where we had this
asciilifeform: http://btcbase.org/log/2019-03-10#1901148 << imho the ( ~homogeneous~ variant of ) fpga is actually the correct model. i.e. you get to stitch it later into however many parallel mechanisms you happen to need on a given occasion.
asciilifeform: i dun expect to even live to see with own eyes a machine where 64bits of addr space fully populated (on current x64 Official standard, only 48 addr lines even connected, the rest mandatory 0)
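[ed. note: on the virtual-address side, the 48-bit limit is usually stated as the "canonical form" rule: bits 63..47 must all be copies of bit 47, so the unused upper bits are either all 0 or all 1. Physical address widths vary by cpu model and are not modelled here. A small sketch of that check follows; `is_canonical_48` is a hypothetical name.]

```python
def is_canonical_48(addr):
    """True if a 64-bit virtual address is canonical under 48-bit addressing.

    Bits 63..47 (17 bits) must be all zeros or all ones, i.e. a sign
    extension of bit 47.
    """
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> 47
    return top == 0 or top == 0x1FFFF

# Number of usable addresses under the 48-bit rule versus the full 64 bits.
usable = 1 << 48           # 256 TiB of byte addresses
full = 1 << 64
fraction = usable / full   # 2**-16
```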
asciilifeform: ( not even speaking here of 512^512... )
asciilifeform: ( at current or even hypothetical magic '1 atom per' density )
asciilifeform: http://btcbase.org/log/2019-03-10#1901108 << pipelineism, branchpredictionism, etc., all these heresies, were birthed from the fact that speed of memory fell massively behind that of cpu, in the time it takes to fetch ~anything~ you can do five digits of clock cycle
asciilifeform: the thing with gigantic multers is that they grow physically with the cube of the bitness. hence scarce. ( tho i dun imagine even a 8192bit single-cycle multer would be remotely near as heavy as the 3bil-transistor 'let's fry eggs' pentium-xxviii or whatnot )
asciilifeform: remember, only a terrorist(tm)(r) 'writes own crypto', 'good citizens use openssl' etc etc
asciilifeform: ( 'you want fast crypto, plebe? here, buy nsa-certified(tm)(r) cryptoaccelerator card from ibm' )
asciilifeform: but no prizes for guessing why it aint on the market.
asciilifeform: ftr an 'iron ffa' cpu does not even require a massive multiplier . even a microcoded ffa-style thing that lets you specify 'and at memory x there is a w-word int, and at y a w word int, add'em' etc , would still massively win over the extant liquishit, it would do the arithm atomically, without invoking branchpredictor, losing cache, etc.
asciilifeform: ... so 'mulx' aint in anyffing i have. if someone wants to test it with own hands, he can, otherwise fughetit.
asciilifeform: and meanwhile2, we have answr to above quandary, 'In 2017, BMI2 was further incorporated in AMD's Zen-architecture...'
asciilifeform meanwhile found today mistake in 17 , and expects it'll take several days to rewrite
asciilifeform: ftr asciilifeform suspects that 99% of what can be won from asmism in ffa, can be had simply from bvt's existing 64bit mul, plus doing adc for the additions-with-carry instead of the manually-cranked carry calc, and that all 'fancy' instructions will only lead to sad
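[ed. note: a sketch of what a "manually-cranked carry calc" can look like when adding two w-word integers without an adc instruction. This is one common branch-free formulation and is not claimed to be the exact ffa code; the names `add_words`, `to_words` and `from_words` are hypothetical. With adc, the iron produces the carry itself and the extra logic line disappears.]

```python
MASK = (1 << 64) - 1  # one 64-bit machine word

def add_words(x, y):
    """Add two equal-length little-endian lists of 64-bit words.

    Returns (sum_words, carry_out). The carry is derived from the top bits
    of the operands and the sum using only logic ops, with no branch on
    the data.
    """
    assert len(x) == len(y)
    out = []
    carry = 0
    for a, b in zip(x, y):
        s = (a + b + carry) & MASK
        # Carry out of bit 63: both top bits set, or one set and the sum's top bit clear.
        carry = ((a & b) | ((a | b) & (~s & MASK))) >> 63
        out.append(s)
    return out, carry

def to_words(n, w):
    """Split non-negative int n into w little-endian 64-bit words."""
    return [(n >> (64 * i)) & MASK for i in range(w)]

def from_words(words):
    """Inverse of to_words."""
    return sum(word << (64 * i) for i, word in enumerate(words))

# 512-bit example: (2**512 - 1) + 1 wraps to zero with carry out 1.
total, carry_out = add_words(to_words(2**512 - 1, 8), to_words(1, 8))
```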
asciilifeform: fwiw it's still a 64bit mul, the only win is that it dun set any flags (and therefore keeps the pipe flowing)
asciilifeform: bvt: the only new instrs that seem to be even theoretically of use, are 'mulx' and 'adcx' -- but i dun have any iron that supports these atm, and cannot even begin to say whether constant time etc
asciilifeform: where yes it'll do a e.g. 512-bit add, but -- evidently -- using ~existing~ regs, and putting out same (or greater) amt of heat, and locking up the pipe
asciilifeform: i.e. they can't stuff any moar transistors in there, and end up offering the equiv. of ye olde 'winmodem'
asciilifeform: it is such a retarded design that even intel and microshit ~tried to escape~, in 1990s. but nsa decided that it ~likes~ x86 , and for it to remain cemented standard , on acct of http://www.loper-os.org/?p=1299 .
asciilifeform: aaand this is not even to mention their seekrit 'optimizations' behind the scenes.
asciilifeform: when you build 1 of these things, there's a set of decisions that end up determining shape of whole thing; and it so happens that intel made ~all~ of the most retarded possible choices.
asciilifeform: ( and this is not even touching the subj of the tlb cache, which is ~1/3 to half of those transistors , which we have on acct of the idjit paging scheme )
asciilifeform: if you ever wonder why your x64 iron draws 50x the wattage to do same thing as e.g. rk, wonder no longer -- the insanity where shit gets moved around to accommodate idjit instructions with fixed in/out hoppers, the insanity where you gotta set prefixes to specify what width ~each operand~ is (why this is needed ? srsly) , all of this adds up to 3bil transistors that heat the room
asciilifeform: btw, bvt , rax etc. ~are~ encoded as 1-8, the iron dun see reg names at all, the classic names are a convention of the asmers and the vendor docs. and imho remains on acct of the asinine x86isms like MUL which use fixed input and output regs, makes'em slightly easier to remember.
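[ed. note: a sketch of how the register names turn into numbers in the instruction bytes. In the vendor encoding tables the numbering starts at 0 (rax=0 ... rdi=7, r8..r15 = 8..15): the low 3 bits go into the ModRM byte and the 4th bit into the REX prefix. The example encodes only the reg-to-reg 64-bit `mov` (opcode 89 /r); `encode_mov_rr` is a hypothetical helper and not a general assembler.]

```python
# Register numbers as they appear in the encoding (names are just a convention).
REG = {name: n for n, name in enumerate([
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
])}

def encode_mov_rr(dst, src):
    """Encode `mov dst, src` for two 64-bit registers using opcode 89 /r."""
    d, s = REG[dst], REG[src]
    # REX prefix: 0100WRXB. W=1 selects 64-bit operands,
    # R carries the high bit of src (ModRM.reg), B the high bit of dst (ModRM.rm).
    rex = 0x48 | ((s >> 3) << 2) | (d >> 3)
    # ModRM: mod=11 (register direct), reg=src low 3 bits, rm=dst low 3 bits.
    modrm = 0xC0 | ((s & 7) << 3) | (d & 7)
    return bytes([rex, 0x89, modrm])

mov_rax_rbx = encode_mov_rr("rax", "rbx").hex()
mov_r8_rax = encode_mov_rr("r8", "rax").hex()
```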