31 May 2011

Decimation of children.

The number of children I have is falling drastically...
I now have only 11.76 TB of unique data, including myself (84996 unique files in 334 datasets).
In total, counting every replica, there are now only 732 dataset copies.
The number of replicas is falling dramatically, but some children are still popular.
# Reps |# Datasets
1 |236
2 | 38
3 | 10
4 | 24
5 | 5
6 | 6
7 | 3
8 | 1
9 | 1
10 | 2
12 | 1
16 | 1
20 | 1
22 | 1
24 | 1
26 | 1
27 | 1
28 | 1
I.e. only 98 of the 334 datasets now actually have replicas.
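The totals quoted above can be cross-checked from the table. The short sketch below assumes that "# Reps" counts every copy of a dataset including the original, so a value of 1 means no replica exists; that reading is an interpretation, but it is the one under which the figures agree. The names `replica_histogram` and `summarise` are made up for illustration.

```python
# Copies per dataset -> number of datasets, transcribed from the table above.
# Assumption: "# Reps" includes the original copy, so 1 means "no replica".
replica_histogram = {
    1: 236, 2: 38, 3: 10, 4: 24, 5: 5, 6: 6, 7: 3, 8: 1, 9: 1,
    10: 2, 12: 1, 16: 1, 20: 1, 22: 1, 24: 1, 26: 1, 27: 1, 28: 1,
}


def summarise(histogram):
    """Return (unique datasets, total copies, datasets with at least one replica)."""
    unique = sum(histogram.values())
    total_copies = sum(copies * count for copies, count in histogram.items())
    replicated = sum(count for copies, count in histogram.items() if copies > 1)
    return unique, total_copies, replicated


unique, total_copies, replicated = summarise(replica_histogram)
print(unique, total_copies, replicated)  # 334 732 98
```

So the 334 unique datasets, the 732 in total and the 98 with replicas all follow from the same table.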

My birthday calendar has also changed. The new birthday calendar looks like:

This shows that, other than the datasets produced within the first 12 weeks of life, at least 659 out of 718 datasets from the last year have had all copies deleted. It also shows that there has been a hive of activity, with new datasets produced (a re-processing) in time for the important "Moriond" conference. It also shows that the majority of datasets are only useful for less than one year. (But it is noticeable that the files first produced seem to be the most long-lived.)

27 May 2011

At last - accurate tape accounting?

So it seems we now have accurate tape accounting - CMS have looked at the numbers generated by the new CIP code and declared that they match their expectations.

The code accounts for data compression as it goes to tape - and estimates the free space on a tape by assuming that other data going to the same tape will compress in the same ratio. Also, as requested by ATLAS, there is now accounting for the "unreachable" data, i.e. data which can't be read because the tape is currently disabled, or free space which can't be used because the tape is read-only.
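As a rough illustration of the idea (not the actual CIP code), the estimate could be sketched as below. The `Tape` class, its field names and the `account` function are all hypothetical. The sketch also assumes that free space on a disabled tape counts as unreachable, and that an empty tape is treated as compressing 1:1 because there is no ratio to go on yet.

```python
from dataclasses import dataclass


@dataclass
class Tape:
    # All field names are hypothetical, chosen for this sketch only.
    capacity: int        # native (uncompressed) capacity in bytes
    physical_used: int   # bytes occupied on tape after compression
    logical_used: int    # bytes of user data written, before compression
    disabled: bool = False
    read_only: bool = False


def estimated_free(tape):
    """Estimate free space in user-data bytes, assuming future data
    compresses in the same ratio as what is already on the tape."""
    physical_free = max(tape.capacity - tape.physical_used, 0)
    if tape.physical_used == 0:
        return physical_free  # assumption: no data yet, so use 1:1
    ratio = tape.logical_used / tape.physical_used
    return int(physical_free * ratio)


def account(tapes):
    """Sum used and free space, keeping unreachable amounts separate."""
    totals = {"used": 0, "free": 0, "unreachable_used": 0, "unreachable_free": 0}
    for tape in tapes:
        free = estimated_free(tape)
        if tape.disabled:
            # Data cannot be read; assume the free space is unusable too.
            totals["unreachable_used"] += tape.logical_used
            totals["unreachable_free"] += free
        elif tape.read_only:
            # Data is readable, but nothing more can be written.
            totals["used"] += tape.logical_used
            totals["unreachable_free"] += free
        else:
            totals["used"] += tape.logical_used
            totals["free"] += free
    return totals
```

For example, a tape with 1000 bytes of capacity holding 600 bytes of data squeezed into 400 bytes has 600 physical bytes left, which at the same 1.5 ratio is estimated as 900 bytes of free space.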

All the difficult stuff should now be complete: the restructuring of the internal objects to make the code more maintainable, and the nearline (aka tape) accounting. Online accounting will stay as it is for now.