27 September 2011

SRM speedup update

Or should that be speedupdate? If you remember the Jamboree last year in Amsterdam, one of the suggestions was to decrease the negotiation overheads in SRM by making more efficient use of the socket. Our very own Paul Millar from DESY has come up with a demonstrator using a Lua shell which can reach SRMs by calling the functions in the API, as S2 does, but in a language that is perhaps simpler to learn, and you get to work in a shell.

Paul demonstrated the speedup in three steps: first calling each function individually, then turning off GSI delegation, and finally reusing the socket with HTTP KeepAlive. You will not be surprised to see a big improvement, but of course the server must support KeepAlive.
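To make the socket reuse point concrete, here is a minimal sketch in Python (not Paul's Lua code, and with no SRM or GSI involved). It assumes nothing beyond the standard library: a throwaway local HTTP/1.1 server counts how many TCP connections it sees, and a client sends the same number of requests with and without reusing its connection. The names `CountingHandler` and `count_connections` are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Toy handler (hypothetical) that counts incoming TCP connections."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # A Content-Length is needed so the client knows where the reply ends
        # and can send the next request on the same socket.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(n_requests, reuse):
    """Send n_requests GETs and return how many connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = None
    try:
        for _ in range(n_requests):
            if conn is None or not reuse:
                if conn is not None:
                    conn.close()
                conn = http.client.HTTPConnection(host, port)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the reply before the next request
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

With reuse the server sees a single connection however many requests are sent; without it, every request pays for a new connection. In the SRM case that cost would also include the security negotiation, which this sketch does not model.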

Combined with the immediate return on srmGet when a file does not need staging, this could again speed up multiple file accesses (and of course you can still submit multiple file requests in a single SRM request.)

Paul has published the code; you can find the dCache LUA SRM interface on the dCache web site.

14 September 2011

Bringing SeIUCCR to people

Am at the SeIUCCR (pronounced "succor" - no, not "sucker") summer school at the Cosener's House in Abingdon; the doors are open out to the garden, and we can well believe it is a summer school. Last time I lectured in a summer school (cloud security) I had made the presentation a bit too easy, so this time (data management) I included some hairy stuff. While it was basically about uploading data to the grid and moving it around, the presentation covered the NGS and GridPP, i.e. Globus and gLite, and we also (once) queried the information system directly (which was the aforementioned hairy part). But, like the 2-sphere, no talk can be hairy everywhere. Oh, and all the demos worked, despite being live.

The main idea is that grids extend the types of research people can do, because we enable managing and processing large volumes of data, so we are in a better position to cope with the famous "data deluge." Some people will be happy with the friendly front end in the NGS portal but we also demonstrated moving data from RAL to Glasgow (hooray for dteam) and to QMUL with, respectively, lcg-rep and FTS.

If you are a "normal" researcher (i.e. not a particle physicist :-)) you normally don't want to "waste time" learning grid data management, but the entry level tools are actually quite easy to get into, no worse than anything else you are using to move data. And the advanced tools are there if and when you eventually get to the stage where you need them, and they are not that hard to learn: a good way to get started is to go to GridPP and click the large friendly HELP button. NGS also has tutorials (and if you want more tutorials, let us know.)

It is worth mentioning that we like holding hands: one thing we have found in GridPP is that new users like to contact their local grid experts - which is also the point of having campus champions. We should have a study at the coming AHM. Makes it even easier to get started. You have no excuse. Resistance is futile.

05 September 2011

New GridPP DPM Tools release: now part of DPM.

I'm happy to announce the release of the next version of the GridPP DPM toolkit, which now includes some tools for complete integrity checking of a disk filesystem against the DPNS database.
This should also be able to checksum the files, although this takes a lot longer.
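For readers who have not met this kind of check before, the general idea can be sketched in a few lines of Python. This is not the toolkit's code and does not talk to DPNS: the catalogue here is just a dictionary you supply, the function names are invented for the illustration, and Adler-32 is an assumed choice of checksum.

```python
import os
import zlib


def adler32_of(path, chunk_size=1 << 20):
    """Adler-32 of a file, read in chunks so large files fit in memory."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def compare_with_catalogue(root, catalogue, verify_checksums=False):
    """Compare files under root with a catalogue (hypothetical format).

    catalogue maps a path relative to root to the expected Adler-32,
    or to None when no checksum is recorded.
    """
    on_disk = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            on_disk.add(os.path.relpath(full, root))
    known = set(catalogue)
    report = {
        "orphans": sorted(on_disk - known),   # on disk, not in the catalogue
        "missing": sorted(known - on_disk),   # in the catalogue, not on disk
        "corrupt": [],                        # present, but checksum differs
    }
    if verify_checksums:
        # This is the slow part: every matching file is read in full.
        for rel in sorted(on_disk & known):
            expected = catalogue[rel]
            if expected is not None and adler32_of(os.path.join(root, rel)) != expected:
                report["corrupt"].append(rel)
    return report
```

The name comparison only walks the directory tree, while the checksum pass has to read every byte, which is why the second mode takes so much longer.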

The bigger change is that the tools are now provided in the DPM repository, as the dpm-contrib-admintools package. Due to packaging constraints, this RPM installs the tools to /usr/bin, so make sure it is earlier in your path than the old /opt/lcg/bin path...
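If you want to check the path ordering rather than eyeball it, a small Python sketch along these lines would do. The helper name `precedes` is made up for this example; the two directories are the ones mentioned above.

```python
import os


def precedes(path_value, preferred, legacy):
    """Return True if preferred is searched before legacy in path_value.

    path_value is a PATH-style string. If legacy is absent, preferred
    wins as long as it is present at all.
    """
    # Drop empty entries and trailing slashes so "/usr/bin/" matches "/usr/bin".
    dirs = [d.rstrip("/") or "/" for d in path_value.split(os.pathsep) if d]
    if preferred not in dirs:
        return False
    if legacy not in dirs:
        return True
    return dirs.index(preferred) < dirs.index(legacy)


if __name__ == "__main__":
    ok = precedes(os.environ.get("PATH", ""), "/usr/bin", "/opt/lcg/bin")
    print("new tools found first" if ok else "old /opt/lcg/bin tools may win")
```

The shell picks the first match it finds when walking the path from left to right, so the check only needs to compare the positions of the two directories.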

Richards would like to encourage other groups with useful DPM tools to contribute them to the repo.