I recently needed to dig into Picasa's internal databases to get some information that it appeared to store only there, and not finding the answer on the interwebs, here are my notes about their format. Please do let me know if you have more information about this file format.
The notes are for the Mac OS, Picasa version 3.9.0.522.
The database files are found under
$HOME/Library/Application Support/Google/Picasa3/db3
on the Macs, and there are equivalent locations on other
platforms. Under here are a set of files with a .pmp
suffix,
which are the database files.
[BTW: The files with the .db
suffix just hold thumbnails of
various groups of images. They are in the standard windows thumbs.db
format, and here's
a link that has more useful information about this format.]
Each .pmp
file represents a field in a table, and the table is
identified by a common prefix as follows:
$ ls -1 catdata_* catdata_0 catdata_catpri.pmp catdata_name.pmp catdata_state.pmp
The file with the _0
suffix is a marker file to identify the
table, and each .pmp
file sharing that prefix is a field for
that table. For instance, catdata_state.pmp
contains records
for the field state
in the table catdata
, and so
forth.
All files start with the four magic bytes: 0xcd 0xcc 0xcc 0x3f
The marker files (ie, files that end in _0
) only contain the
magic bytes.
The pmp file is in little-endian format rather than the usual network byte/big-endian format.
There are several areas where I just see constants -- I don't know the purpose of these and I'll list them out. Please note: all values are presented in little-endian format, so if you hex-dump a file, you should see the bytes reversed.
Header
4bytes: magic: 0x3fcccccd
2bytes: field-type: unsigned short.
2bytes: 0x1332 -- constant.
4bytes: 0x00000002 -- constant.
2bytes: field-type: unsigned short -- identical with field-type above.
2bytes: 0x1332 -- constant.
4bytes: number-of-entries: unsigned int.
Following the header are "number-of-entries" records, whose format depends on the field-type. The field-type values are:
0x0: null-terminated strings. I haven't tested how (if at all) it can store unicode.
0x1: unsigned integers, 4 bytes.
0x2: dates, 8 bytes as a double. The date is represented in Microsoft's Variant Time format. The 8 bytes are a double, and the value is the number of days from midnight Dec 30, 1899. Fractional values are fractions of a day, so for instance, 3.25 represents 6:00 A.M. on January 2, 1900. While negative values are legitimate in the Microsoft format and indicates days prior to Dec 30, 1899, the Picasa user interface currently prevents dates older than Dec 31, 1903 from being used.
0x3: byte field, 1 unsigned byte.
0x4: unsigned long, 8bytes.
0x5: unsigned short, 2bytes.
0x6: null-terminated string. (possibly csv strings?)
0x7: unsigned int, 4 bytes.
The entities are indexed by their record number in each file. Ie,
fetching the 7273'rd record in all files named imagedata_*pmp
gives information about the fields for entity #7273 in the
imagedata
table.
You might expect every "field file" for a given table to contain the same number of records, but this is not always the case. I expect the underlying library returns the equivalent of undefined when fetching fields for a record beyond the "end" of any given field file.
Finally, a small java program to dump out whatever information I've gathered thus far. Compile, and run against a set of .pmp files.
Here is a sample run.
$ javac -g -d . Read.java $ java Read "$HOME/Library/Application Support/Google/Picasa3/db3/catdata_name.pmp" /Users/kbs/Library/Application Support/Google/Picasa3/db3/catdata_name.pmp:type=0 nentries: 10 [0] Labels [1] Projects (internal) [2] Folders on Disk [3] iPhoto Library [4] Web Albums [5] Web Drive [6] Exported Pictures [7] Other Stuff [8] Hidden Folders [9] People
And here's the code.
import java.io.*; import java.util.*; public class Read { public static void main(String args[]) throws Exception { for (int i=0;i <args.length; i++) { doit(args[i]); } } private final static void doit(String p) throws Exception { DataInputStream din = new DataInputStream (new BufferedInputStream (new FileInputStream(p))); dump(din, p); din.close(); } private final static void dump(DataInputStream din, String path) throws Exception { // header long magic = readUnsignedInt(din); if (magic != 0x3fcccccd) { throw new IOException("Failed magic1 "+Long.toString(magic,16)); } int type = readUnsignedShort(din); System.out.println(path+":type="+Integer.toString(type, 16)); if ((magic=readUnsignedShort(din)) != 0x1332) { throw new IOException("Failed magic2 "+Long.toString(magic,16)); } if ((magic=readUnsignedInt(din)) != 0x2) { throw new IOException("Failed magic3 "+Long.toString(magic,16)); } if ((magic=readUnsignedShort(din)) != type) { throw new IOException("Failed repeat type "+ Long.toString(magic,16)); } if ((magic=readUnsignedShort(din)) != 0x1332) { throw new IOException("Failed magic4 "+Long.toString(magic,16)); } long v = readUnsignedInt(din); System.out.println("nentries: "+v); // records. if (type == 0) { dumpStringField(din,v); } else if (type == 0x1) { dump4byteField(din,v); } else if (type == 0x2) { dumpDateField(din,v); } else if (type == 0x3) { dumpByteField(din, v); } else if (type == 0x4) { dump8byteField(din, v); } else if (type == 0x5) { dump2byteField(din,v); } else if (type == 0x6) { dumpStringField(din,v); } else if (type == 0x7) { dump4byteField(din,v); } else { throw new IOException("Unknown type: "+Integer.toString(type,16)); } } private final static void dumpStringField(DataInputStream din, long ne) throws IOException { for (long i=0; i<ne; i++) { String v = getString(din); System.out.println("["+i+"] "+v); } } private final static void dumpByteField(DataInputStream din, long ne) throws IOException { for (long i=0; i<ne; i++) { int v = din.readUnsignedByte(); System.out.println("["+i+"] "+v); } } private final static void dump2byteField(DataInputStream din, long ne) throws IOException { for (long idx=0; idx<ne; idx++) { int v = readUnsignedShort(din); System.out.println("["+idx+"] "+v); } } private final static void dump4byteField(DataInputStream din, long ne) throws IOException { for (long idx=0; idx<ne; idx++) { long v = readUnsignedInt(din); System.out.println("["+idx+"] "+v); } } private final static void dump8byteField(DataInputStream din, long ne) throws IOException { int[] bytes = new int[8]; for (long idx=0;idx<ne; idx++) { for (int i=0; i<8; i++) { bytes[i] = din.readUnsignedByte(); } System.out.print("["+idx+"] "); for (int i=7; i>=0; i--) { String x = Integer.toString(bytes[i],16); if (x.length() == 1) { System.out.print("0"); } System.out.print(x); } System.out.println(); } } private final static void dumpDateField(DataInputStream din, long ne) throws IOException { int[] bytes = new int[8]; for (long idx=0;idx<ne; idx++) { long ld = 0; for (int i=0; i<8; i++) { bytes[i] = din.readUnsignedByte(); long tmp = bytes[i]; tmp <<= (8*i); ld += tmp; } System.out.print("["+idx+"] "); for (int i=7; i>=0; i--) { String x = Integer.toString(bytes[i],16); if (x.length() == 1) { //System.out.print("0"); } //System.out.print(x); } //System.out.print(" "); double d = Double.longBitsToDouble(ld); //System.out.print(d); //System.out.print(" "); // days past unix epoch. d -= 25569d; long ut = Math.round(d*86400l*1000l); System.out.println(new Date(ut)); } } private final static String getString(DataInputStream din) throws IOException { StringBuffer sb = new StringBuffer(); int c; while((c = din.read()) != 0) { sb.append((char)c); } return sb.toString(); } private final static int readUnsignedShort(DataInputStream din) throws IOException { int ch1 = din.read(); int ch2 = din.read(); if ((ch1 | ch2) < 0) throw new EOFException(); return ((ch2<<8) + ch1<<0); } private final static long readUnsignedInt(DataInputStream din) throws IOException { int ch1 = din.read(); int ch2 = din.read(); int ch3 = din.read(); int ch4 = din.read(); if ((ch1 | ch2 | ch3 | ch4) < 0) throw new EOFException(); long ret = (((long)ch4)<<24) + (((long)ch3)<<16) + (((long)ch2)<<8) + (((long)ch1)<<0); return ret; } }