Comments on Blasted Bioinformatics!?: How not to deal with NGS data - MrFast & MrsFast
Peter Cock (feed updated 2024-01-31T12:30:28.282+00:00)

sebhtml, 2013-01-29T14:11:01.456+00:00

> Personally I would use the term "lazy loading" for tasks where you may never need to look at a piece of data.

That is a case where on-demand resource provisioning works very well: the client asks for a piece of data but never actually looks at it (a false promise).

> Here with read mapping, the program will need to look at every read eventually.

If I am not mistaken, bwa is quite efficient and loads individual chunks of data from its input files. From the outside, this can be seen as lazy loading too. It works well because objects are touched sequentially.

We use lazy loading in Ray. It's basically just a class that sits between a data file and an operator class and provides data on demand.

According to Wikipedia, these are the four types of lazy loading:

> lazy initialization;

object construction is deferred until first use

> a virtual proxy;

pretty much like lazy initialization, but with an object that wraps the true object and exposes the same interface

> a ghost

uses a partial state; this can be used to partially load a FASTQ file

> a value holder

an object to which lazy loading is delegated

It's also somewhat related to copy-on-write (except that here it's more like copy-on-read):

http://en.wikipedia.org/wiki/Copy-on-write

You don't provide the resource unless the customer/client requests it. Other objects can remain in the warehouse (the FASTQ files).

Peter Cock, 2013-01-28T18:42:59.398+00:00

Ideally yes, although all too often in bioinformatics this kind of software optimisation isn't done in the original code base :(

Peter Cock, 2013-01-28T18:41:52.645+00:00

Glad you like the blog :)

Personally I would use the term "lazy loading" for tasks where you may never need to look at a piece of data. Here with read mapping, the program will need to look at every read eventually.

sebhtml, 2013-01-28T18:04:19.662+00:00

> Since the tools are currently single-threaded, they break up their read files and run multiple instances in order to take advantage of multiple cores.

This part should be done by the software tool, not by its end user.

sebhtml, 2013-01-28T18:00:14.115+00:00

Hi,

I really like your blog.

> Instead where possible you loop over a file (iterating over it record by record) or employ indexed random access.

There is a design pattern called "lazy loading" for that.

http://en.wikipedia.org/wiki/Lazy_loading
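
Editor's illustration: the record-by-record iteration discussed in this thread can be sketched with a Python generator. This is a minimal sketch, not code from bwa, Ray, MrFast or MrsFast. The names `FastqRecord` and `iter_fastq` are hypothetical, and the sketch assumes the simple four-line FASTQ layout (no wrapped sequence lines).

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """Hypothetical container for one read."""
    name: str
    sequence: str
    quality: str


def iter_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield reads one at a time so only the current record is in memory.

    Assumes four-line FASTQ records: header, sequence, '+', quality.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file reached
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield FastqRecord(header[1:].strip(), sequence, quality)


# Usage: an in-memory file stands in for a large FASTQ file on disk.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nFFFF\n")
names = [record.name for record in iter_fastq(example)]
```

Because the generator reads a record only when the consumer asks for the next one, memory use stays roughly constant regardless of file size, which is the point both commenters agree on, whichever name the pattern is given.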