c++ - Processing a large amount of rows chunked / unbuffered
Consider reading through the rows of a large table with MySQL Connector/C++:
    std::unique_ptr<sql::ResultSet> res(
        stmt->executeQuery("SELECT a, b FROM table"));
    while (res->next()) {
        handle(res->getUInt64(1), res->getDouble(2));
    }
From the documentation:
As of writing, MySQL Connector/C++ returns buffered results for Statement objects. Buffered result sets are cached on the client. The driver always fetches all data, no matter how big the result set is. Future versions of the connector are expected to return buffered and unbuffered results for Statement objects.
This is in accordance with my observation: on smaller tables (~1e8 rows), it takes 3 minutes before the first call to handle, and the rest completes in 7 seconds. On larger tables (~1e10 rows), it keeps gobbling up more and more memory until it runs out.
How can such queries be handled without running out of memory, while keeping the code reasonably efficient and concise?
I find it hard to believe that there is no transparent solution; chunked streaming of the result within the MySQL layer seems like such an obvious optimization.
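The underlying MySQL C API does expose unbuffered, row-by-row fetching via mysql_use_result(). Dropping down to it would not be transparent either, but for completeness, here is a minimal sketch; the connection parameters and the table name are placeholders, and error handling is omitted:

    #include <cstdint>
    #include <cstdlib>
    #include <mysql/mysql.h>

    void handle(std::uint64_t a, double b);  // provided by the library mentioned below

    void stream_rows()
    {
        MYSQL *conn = mysql_init(nullptr);
        mysql_real_connect(conn, "host", "user", "password", "db", 0, nullptr, 0);

        mysql_query(conn, "SELECT a, b FROM t");

        // mysql_use_result() does not buffer the result set on the client:
        // each mysql_fetch_row() call pulls the next row from the server.
        MYSQL_RES *res = mysql_use_result(conn);
        MYSQL_ROW row;
        while ((row = mysql_fetch_row(res)) != nullptr) {
            // Values arrive as C strings and must be converted.
            handle(std::strtoull(row[0], nullptr, 10),
                   std::strtod(row[1], nullptr));
        }

        mysql_free_result(res);
        mysql_close(conn);
    }

Note that with mysql_use_result() the connection stays busy until every row has been fetched or the result is freed.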
Note: handle is in a library and is faster than the MySQL server can provide data. The rows must be processed in their natural order, so the work cannot be parallelized.
One way of doing it, though not transparently, is chunking on the client: get the total row count, then use the LIMIT keyword in the query, combined with a loop, to launch multiple queries whose chunk size keeps memory usage acceptable. You would have to be sure that the table contents are not changing in the meantime to be able to do that.
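A rough sketch of that chunking loop, assuming placeholder table and column names and a fixed numeric chunk size (the ORDER BY is there so that successive LIMIT/OFFSET windows cover the rows in a deterministic order):

    #include <cstdint>
    #include <memory>
    #include <string>
    #include <cppconn/resultset.h>
    #include <cppconn/statement.h>

    void handle(std::uint64_t a, double b);  // provided by the library mentioned above

    void process_in_chunks(sql::Statement &stmt, std::uint64_t chunk = 1000000)
    {
        // Total row count, taken once up front; the table must not change
        // while the loop runs, otherwise rows may be skipped or seen twice.
        std::uint64_t total = 0;
        {
            std::unique_ptr<sql::ResultSet> res(
                stmt.executeQuery("SELECT COUNT(*) FROM t"));
            if (res->next())
                total = res->getUInt64(1);
        }

        // Each iteration buffers only one chunk on the client.
        for (std::uint64_t offset = 0; offset < total; offset += chunk) {
            std::unique_ptr<sql::ResultSet> res(stmt.executeQuery(
                "SELECT a, b FROM t ORDER BY a"
                " LIMIT " + std::to_string(chunk) +
                " OFFSET " + std::to_string(offset)));
            while (res->next())
                handle(res->getUInt64(1), res->getDouble(2));
        }
    }

One drawback: large OFFSET values get progressively slower, because the server still has to walk past all the skipped rows. If column a is indexed, keyset pagination (WHERE a > last_seen ... ORDER BY a LIMIT chunk) avoids that cost.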