asciilifeform: http://usagl.com << from above, lulzy, old-school voice telecom co.
phf: asciilifeform: no, it's a command that you run, like REINDEX index_of_things; it simply queries what's already in DB and warms up the cache
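A minimal sketch of that sort of cache warming in stock PostgreSQL, assuming the pg_prewarm extension is available; the table and index names here are invented:
    CREATE EXTENSION IF NOT EXISTS pg_prewarm;
    SELECT pg_prewarm('moduli');           -- read the table's pages into the buffer cache
    SELECT pg_prewarm('moduli_hash_idx');  -- same for an index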
asciilifeform: holy shit was that a ... user submission?!
deedbot: http://phuctor.nosuchlabs.com/gpgkey/A07FCCF0D46AC8B25EB8F0982629537817E0CEA47BCC6C8B800A06F4F4647160 << Recent Phuctorings. - Phuctored: 7 divides RSA Moduli belonging to 'Todd A. Outten <out...om>; ' (host-95.215.85.243.ongnet.ru. Unknown)
deedbot: http://phuctor.nosuchlabs.com/gpgkey/289FCBF68419984FD484C7EF7823AB7C114193224DD2733C7A60A20BC118F5A6 << Recent Phuctorings. - Phuctored: 7 divides RSA Moduli belonging to ' <spaf@mac.com>; <spa...du>; Gene Spafford <gene@spaf.us>; Gene Spafford <spaf@acm.org>; Gene Spafford <spaf@mac.com>; Gene Spafford <spa...du>; Gene Spafford <spa...rg>; Eugene H. Spafford <spaf@mac.com>; Eugene H. Spafford <spa...iz>; Gene Spafford <SPA...du>; Gene Spafford <spa...
deedbot: http://phuctor.nosuchlabs.com/gpgkey/0CCA49DE4C9967BFAE78ACF9D1AD438154B75B9700A055CA3DAC80F1714A2AA0 << Recent Phuctorings. - Phuctored: 7 divides RSA Moduli belonging to 'Connie Main <cma...om>; ' (host-95.215.85.243.ongnet.ru. Unknown)
Framedragger: docs say it would need to get rebuilt only if there were any unwritten changes, which there shouldn't be as asciilifeform is not using write cache
asciilifeform: mircea_popescu: the other obvious thing would be to dispense with 'real time submission' entirely, and when someone dumps in a key, it goes into next batch. but we discussed this earlier in this thread, it would mean that the thing cannot be used as sks-like tool.
asciilifeform: this has to be done programmatically ?
phf: is it a hash index? it has the least overhead (it isn't logged among other things, so you have to rebuild it on crash, but conversely it's kept in memory and only supports the = operation). indexes will make your queries cheaper, but writes more expensive, so you want to make sure it's the cheapest possible
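A minimal sketch of the index phf is describing, assuming a table named keys with a hash column (both names invented):
    CREATE INDEX keys_hash_idx ON keys USING hash (hash);
    -- hash indexes of that PostgreSQL era are not WAL-logged and support only =,
    -- so equality probes stay cheap but the index has to be rebuilt after a crash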
asciilifeform: (and bernsteinization requires access to ~all~ moduli, as i think is obvious, and not simply 'most recent ones')
asciilifeform: mircea_popescu: but yes, for next version (presently only exists in my notebook) there is a nursery and it gets merged into main table at night. but this makes for considerably more complex system, where there are two very distinct types of submission, 'realtime' and 'scripted' , and they get treated quite differently.
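A sketch of what such a nightly merge could reduce to on the SQL side; nursery_moduli and moduli are invented names, and ON CONFLICT assumes PostgreSQL 9.5 or later:
    BEGIN;
    INSERT INTO moduli SELECT * FROM nursery_moduli ON CONFLICT DO NOTHING;
    TRUNCATE nursery_moduli;
    COMMIT;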
asciilifeform: phf: there is
Framedragger: (and follow-up, does explain analyze show the use of that index)
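That follow-up check would look roughly like this (names and the literal are illustrative):
    EXPLAIN ANALYZE SELECT 1 FROM keys WHERE hash = '\xdeadbeef';
    -- the plan should show an index scan using keys_hash_idx rather than a Seq Scan on keys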
asciilifeform: and no, you can't query the nursery every time somebody loads a url, or you get SAME performance as now, omfg
phf: a sort of impolite question, but is there an index on the hash column?
asciilifeform: well 1 db, 2 sets of key/fp/factor tables. ☟︎
mircea_popescu: no, have one phuctor with one db and two intakes.
asciilifeform: only the 'adult' db
asciilifeform: but what this adds up to is to have ~two~ quite separate phuctors. we wouldn't query the nursery, for instance, when someone keys in a url with a hash
asciilifeform: it is the most obvious unmassaged piece , aha. the correct algo is , imho, to have separate 'nursery' (gcism term of art) table for the batch submits.
mircea_popescu: this is a piddly excuse, "no duplicate logic", case of luser with wwwform and case of 100k keys in batch form are different enough to warrant duplicate logic. that's why computers even exist, to account for such level of difference in code.
mircea_popescu: yes, well, that's then the problem. they should go in as a single query the size of the batch, with the items sorted within it
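The single-query-per-batch shape being suggested, roughly; table and column names are made up, and the rows would be sorted before the statement is assembled:
    INSERT INTO moduli (fp, modulus) VALUES
        ($1, $2),
        ($3, $4),
        ($5, $6)   -- one VALUES list the size of the batch, pre-sorted by fp
    ON CONFLICT DO NOTHING;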
asciilifeform: i dun give half a shit about 'image'. laying out the fact of why the thing is as it is.
mircea_popescu: would you stop with these bizarro deflections, they neither impress nor persuade, but they do give you an ugly image.
asciilifeform: that way there is exactly one procedural path for key submission, and no duplicate logic.
asciilifeform: they get thrown into same hole as if human submits.
mircea_popescu: uh. then why do you put the keys in in batches if you're not... putting them in in batches ?
asciilifeform: which is to say, rewrite of WHOLE thing. we had a thread.
a111: Logged on 2016-12-30 16:11 mircea_popescu: http://btcbase.org/log/2016-12-30#1593286 << actually workmem should be 256mb especially as you can afford it so totally, go for it.
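That setting, as it would actually be applied (per session, or persistently with ALTER SYSTEM on PostgreSQL 9.4+):
    SET work_mem = '256MB';
    -- or:  ALTER SYSTEM SET work_mem = '256MB';  then  SELECT pg_reload_conf();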
mircea_popescu: asciilifeform if that's where it spends most time then a) http://btcbase.org/log/2016-12-30#1593462 is very likely to help and b) preparing your whole query as ONE single sorted item will help also. ☝︎
asciilifeform: mats: i especially loved the 1 single av signature offered
asciilifeform: (in fact, dumping out the entire db, and properly bignumizing, takes about 3min total for the current db.)
phf: asciilifeform: right, i was going to get to that :}
asciilifeform: these end up parsed into operable bignums every shot. but, surprisingly, this never takes > 3 minutes !
asciilifeform: when i profiled it, 99% of the time is spent in 'do we have this key hash? no? insert; do we have these fp's? no? insert...'
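One common way to collapse that check-then-insert pair into a single statement, assuming PostgreSQL 9.5+, invented table/column names, and a unique constraint on hash:
    INSERT INTO keys (hash, keydata) VALUES ($1, $2)
    ON CONFLICT (hash) DO NOTHING;
    -- one round trip instead of a SELECT followed by a conditional INSERT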
phf: i'm not even arguing with you, i'm saying that the ~full extent~ of what "move it to psql" is going to do is ~eliminate cross-boundary issue~ that is all. so it'll shave some significant overhead, but it's not a silver bullet.
asciilifeform: (or second, depending how to count)
asciilifeform: the current iteration is, iirc, the third from-the-ground rewrite.
asciilifeform: believe it or not, i actually put some work into this thing
asciilifeform: so that's probably not it.
asciilifeform: phf: actually the wwwtronic piece of phuctor is in python and does the precompiled queries thing
phf: what i'm saying is that a significant fraction of "1000s of queries AND ..." is the cross-boundary. you compile queries on c side, you send them to psql, it then parses, prepares results, serializes, sends it to c side, c side has to now parse all over again
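For concreteness, what "precompiled queries" amounts to at the SQL level; the statement and table names are invented:
    PREPARE have_factor (bytea) AS SELECT 1 FROM factors WHERE factor = $1;
    EXECUTE have_factor('\x010001');
    -- parse and plan happen once at PREPARE; each EXECUTE only ships parameters and rows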
mircea_popescu: well, it was a thought.
asciilifeform: i think so?
asciilifeform: phf: what part of 'this isn't the bottleneck' was unclear
phf: well, a more practical approach would be to adapt phuctor c part to a postgresql loadable module interface. in which case he will eliminate the cross-boundary overhead (serialize/deserialize over the "wire").
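The SQL side of wiring in such a loadable C module would look roughly like this; the module and symbol names are hypothetical:
    CREATE FUNCTION bernstein_pass() RETURNS void
        AS 'phuctor_module', 'bernstein_pass'
        LANGUAGE C STRICT;
    -- the heavy lifting lives in the shared object and runs inside the server process,
    -- with nothing serialized over the wire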
asciilifeform: mircea_popescu is seeing it through the naive vertically integrating rockefeller eyes, 'power plant expensive? let's put it right in my mansion'
asciilifeform: if it somehow had to happen inside postgres, it would not bypass the lock.
asciilifeform: understand, the only reason why the thing works at all, is that this one small part of it, the bernsteinization, can be made ~entirely~ independent from the db locking idiocy
asciilifeform: and not the bernsteining.
mircea_popescu: it has ~some~ ability to precompile your queries, which is somewhat like linking object code.
asciilifeform: because the actual bottleneck is '1000s of queries AND inserts / second AND guaranteed realtime consistent'
mircea_popescu: rather than in c.
asciilifeform: a sql or similar db system with built-in bignumatron could be useful and interesting. but no such thing exists. nor would it solve the actual bottleneck in phuctor if it were to be discovered tonight.
mircea_popescu: yes but it has this convenient hole through which you can go in, which is - implement bernstein IN sql.
asciilifeform: i for one do not expect to live long enough to make a serious attempt at such a thing.
mircea_popescu: which yes takes some work, but not quite as much as the other variant.
asciilifeform: mircea_popescu: if you think trb is a hard nut to crack, picture reading, grasping postgres.
mircea_popescu: bitcoin wants its own fs. ANOTHER way, is to use the means the db already offers for this.
mircea_popescu: you're not addressing the idea. currently you use a pile of c code you labeled for purely personal reasons "a db" to store some data for you, and another pile, you labeled phuctor, to bernstein and do other things on the db-stored data. because the interface is the bottleneck, it then becomes clear you must merge this. one way is to merge by lifting the db code and putting it into phuctor, making it you know, its own db like
asciilifeform: so that leaves 1 plausible explanation.
asciilifeform: and at this point it is imho unlikely that he has not heard of phuctor.
a111: Logged on 2016-12-30 16:28 mircea_popescu: but he might be interested to hear about it.
asciilifeform: http://btcbase.org/log/2016-12-30#1593516 << recall, i wrote to bernstein himself. ☝︎
asciilifeform: so it has not been a priority, because batching will tremendously complicate the moving parts.
asciilifeform: the querying of 'do-we-have-this-factor' is maybe 1% of the load.
phf: asciilifeform: i'm just trying to establish the dataflow here, for my own curiosity
mircea_popescu: and you do it as prepared queries, which get precompiled to a degree
asciilifeform: it is, by lightyears, the best known algo for batch gcd, also.
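For reference, the core of Bernstein's batch GCD, stated roughly: given moduli N_1, ..., N_k, compute the product P = N_1 * N_2 * ... * N_k with a product tree, then compute P mod N_i^2 for each i with a remainder tree, and finally take gcd(N_i, (P mod N_i^2) / N_i); any modulus sharing a prime with another yields a non-trivial gcd, and the whole pass runs in quasi-linear rather than quadratic time in the total bit length.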
asciilifeform: by definition, that's what it does.
asciilifeform: there is no way around this.
trinque: asciilifeform: a temp table is in RAM
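A sketch with invented names; strictly speaking a temp table is session-local and unlogged, and stays in RAM only up to temp_buffers, spilling to disk beyond that:
    SET temp_buffers = '256MB';   -- must be set before the session first touches a temp table
    CREATE TEMP TABLE nursery_moduli (fp bytea, modulus bytea);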
phf: asciilifeform: oh so you do insert to a set, every time there's a result, and you query for the whole set before you start a processing cycle?
asciilifeform: i need ~less~ access to db, not moar.
mircea_popescu: you implement bernstein IN the db. it is actually a programming language.
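True in the narrow sense; e.g. a toy gcd over arbitrary-precision numeric in PL/pgSQL, which would be only the first brick of such a thing, not Bernstein's product/remainder trees:
    CREATE FUNCTION gcd_numeric(a numeric, b numeric) RETURNS numeric AS $$
    DECLARE t numeric;
    BEGIN
        WHILE b <> 0 LOOP
            t := b; b := a % b; a := t;   -- numeric handles arbitrarily large integers
        END LOOP;
        RETURN a;
    END;
    $$ LANGUAGE plpgsql IMMUTABLE;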
asciilifeform: so holy shit is this not screamingly obvious
asciilifeform: mircea_popescu: algo ~demands~ O(1) random access to the bignums.
mircea_popescu: but he might be interested to hear about it. ☟︎
mircea_popescu: though i am unaware anyone ever implemented this ; because, of course, i am unaware anyone used the guy's algo for any other purpose than gawking.
asciilifeform: this, by all rights, ought to be a batch query. and probably will be in next version.
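A plausible shape for that batch version, with assumed names; one array parameter replaces N separate round trips:
    SELECT factor FROM factors WHERE factor = ANY($1);
    -- $1 passed as an array of candidate factors; one query reports which are already known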
asciilifeform: if answer is 'no', it is inserted in 'factors' table.
asciilifeform: phf: nope. the only thing that happens to db as a result of bernsteinization is N queries 'do we already know this factor'
phf: so you basically snapshot your entire dataset back into the database at certain times, and snapshot is an equivalent of set merge?
asciilifeform: the whole thing working at all is predicated on these seemingly 'abusive' design choices
asciilifeform: where i have O(1) access to them.
asciilifeform: have to live in MY data structure, in ram.
asciilifeform: so no, they can't 'live in db' while it happens
asciilifeform: trinque: i need random-access in O(1) to them for bernsteining
trinque: I am sadly, quite good at SQL if you want the thing translated
asciilifeform: oh and then, factors are found, largely the same set every time (how bernsteinization works) and each one is queried to the db
trinque: might be faster to do in the db
asciilifeform: this is easily 10% of the load on the db
asciilifeform: (the moduli have to turn into an array of bignum*)
asciilifeform: also did i mention that the entire db gets shat out every time we bernstein ?
asciilifeform: (and, painfully, i had to find the offending garbage by hand!)
mircea_popescu: you can also set bgwriter_lru_maxpages to 0 and disable background writing altogether
asciilifeform: the db absolutely has to be in a consistent state at all times, or 0 phuctoring takes place.
mircea_popescu: eg bgwriter_delay you may want to set low for this reason.
mircea_popescu: asciilifeform all these are memory usage ops, what they do is establish when it should go on disk. they do not significantly affect cord-yank robustness. there are other settings you can give the background writer, for instance, that do.
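The two background-writer knobs mentioned in this sub-thread, as they would actually be set (ALTER SYSTEM is PostgreSQL 9.4+; the delay value is illustrative):
    ALTER SYSTEM SET bgwriter_lru_maxpages = 0;   -- disables background writing entirely
    -- or, conversely, wake the writer more often so dirty pages go to disk sooner:
    ALTER SYSTEM SET bgwriter_delay = '50ms';     -- default is 200ms
    SELECT pg_reload_conf();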