BerkeleyDB Notes

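1. If the DB_THREAD flag is specified in db.open, then when calling db.get the Dbt object for the data (not required for the key, but required for the data) must set one of the DB_DBT_MALLOC, DB_DBT_REALLOC or DB_DBT_USERMEM flags, which control how space for the returned record value is allocated. Otherwise the get fails. The cursor get method has no such restriction. This one took me half a day to debug.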

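2. Take special care when the callback function of a secondary index database extracts the secondary key. Do not copy the primary database's data field onto the stack and then assign that copy to the secondary key, because the stack memory is gone once the callback returns. Point the secondary key directly at an address inside the pdata parameter instead (if the key really has to be built in newly allocated memory, it must be heap memory, with DB_DBT_APPMALLOC set on the result). I only discovered this while stepping through the source code.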
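The pointer rule in note 2 can be sketched without the library. This is only an illustration: `MiniDbt` is a hypothetical stand-in for Berkeley DB's Dbt (just a pointer and a length), and the `Employee` record layout is invented for the example. The extractor points the secondary key into the buffer that pdata already refers to, so nothing on the callback's own stack is referenced after it returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Berkeley DB's Dbt: a pointer plus a length.
struct MiniDbt {
    void*         data;
    std::uint32_t size;
};

// Hypothetical primary record layout, used only for this illustration.
struct Employee {
    std::uint32_t id;
    char          name[16];
};

// Shaped like a secondary-key extractor: returns 0 on success.
int extract_name(const MiniDbt* /*pkey*/, const MiniDbt* pdata, MiniDbt* skey) {
    if (pdata->size < sizeof(Employee)) {
        return -1;  // record too short to contain the name field
    }
    // Point into the caller-owned buffer behind pdata. Copying the record
    // into a local Employee and pointing at that copy would leave skey
    // dangling as soon as this function returns.
    char* base = static_cast<char*>(pdata->data);
    skey->data = base + offsetof(Employee, name);
    skey->size = sizeof(Employee::name);
    return 0;
}
```

With the real API the same idea applies to the result Dbt passed to the callback registered through Db::associate.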

About pagesize and cachesize

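3. The official documentation recommends setting the page size to the file system's block size, while the cache can be made larger. For a small application, estimate the total size of the database and set the cache to that size. For a large application, decide based on actual test results.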

4. Because databases opened within Berkeley DB environments use the cache specified to the environment, it is an error to attempt to set a cache in a database created within an environment.

In other words, a database created inside an environment shares the environment's cache, so configure the cache size on the environment rather than on the database.

5. The Db::set_cachesize method may not be called after the Db::open method is called.

That is, the cache size has to be set before the database is opened.

6. The Db::set_pagesize method may not be called after the Db::open method is called. If the database already exists when Db::open is called, the information specified to Db::set_pagesize will be ignored.

That is, Db::set_pagesize has to be called before Db::open. If the database already exists when it is opened, a newly specified pagesize has no effect: the value passed to Db::set_pagesize is ignored.
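For the cache sizing in note 3: Db::set_cachesize takes the size as a number of gigabytes plus a number of additional bytes (followed by the number of cache regions). A small helper like the one below, whose name and struct are made up for this note, splits an estimated total database size into those two values. It is a sketch of the arithmetic only and does not call the library.

```cpp
#include <cstdint>

// Hypothetical helper type: the two size arguments of Db::set_cachesize.
struct CacheSize {
    std::uint32_t gbytes;  // whole gigabytes
    std::uint32_t bytes;   // remaining bytes, always below 1 GiB
};

// Split an estimated total size in bytes into (gbytes, bytes).
CacheSize split_cache_size(std::uint64_t total_bytes) {
    const std::uint64_t kGiB = 1ULL << 30;
    return CacheSize{static_cast<std::uint32_t>(total_bytes / kGiB),
                     static_cast<std::uint32_t>(total_bytes % kGiB)};
}
```

The result would then be passed along the lines of set_cachesize(c.gbytes, c.bytes, 1), before the open call.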

There are several issues to consider when selecting a pagesize: overflow record sizes, locking, I/O efficiency, and recoverability.

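1. If the pagesize is too small, records that do not fit in a page are stored in so-called overflow pages. Overflow pages live outside the standard database structure and are slower to access, so a pagesize that is too small produces a large number of overflow pages and seriously hurts performance.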

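2. A pagesize that is too large also hurts performance, because Berkeley DB locks at page granularity (the Queue access method is the exception: it uses record-level locks rather than page-level locks). A larger page therefore means more contention between threads or processes.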

For this reason, you want to select a page size that is at least large enough to hold multiple entries given the expected average size of your database entries. In BTree's case, for best results select a page size that can hold at least 4 such entries.
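The rules of thumb above can be turned into a rough calculation. The sketch below is my own helper, not part of Berkeley DB: it picks the smallest power of two that is at least the file system block size and can hold four average entries, clamped to the 512-byte to 64 KB range that Berkeley DB accepts for page sizes. It ignores per-page and per-entry overhead, so treat the result as a starting point for testing.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: suggest a page size from the average entry size
// and the file system block size (both in bytes).
std::uint32_t suggest_page_size(std::uint32_t avg_entry_bytes,
                                std::uint32_t fs_block_bytes,
                                std::uint32_t entries_per_page = 4) {
    const std::uint32_t kMinPage = 512;        // smallest allowed page size
    const std::uint32_t kMaxPage = 64 * 1024;  // largest allowed page size

    // Space wanted for the entries, but never below the block size.
    std::uint64_t needed =
        static_cast<std::uint64_t>(avg_entry_bytes) * entries_per_page;
    needed = std::max<std::uint64_t>(needed, fs_block_bytes);

    // Page sizes must be powers of two, so double until large enough
    // or until the upper limit is reached.
    std::uint32_t page = kMinPage;
    while (page < needed && page < kMaxPage) {
        page *= 2;
    }
    return page;
}
```

For example, 3000-byte entries on a file system with 4096-byte blocks give 16384, and the value would be passed to Db::set_pagesize before Db::open when creating the database.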

I am still studying the lock subsystem and the transaction subsystem. They are the core of Berkeley DB and also the hardest part.