MyDeDub

MyDeDub is a data-deduplicating NBD server.
In the current state, the program is merely a proof of concept.
The special part of this program is that it uses a MySQL database to store the data.
As MyDeDub works on the block level, it is filesystem agnostic. So you can create ext3, xfs, jfs, any filesystem you like on it. Tested with ext2, ext3 and ext4, and with OCFS2 (a clustered filesystem; you need to disable the MyDeDub cache for that).
NBD is short for 'network block device'. It is comparable to iSCSI and FCoE. NBD servers are available for almost all platforms. NBD clients are included in the Linux, HURD and Solaris kernels.
Download

MyDeDub-0.4.jar - now also handles arbitrary block size, some speed improvements
mdd-0.4.tgz - source
Please note that version 0.4 (at least) does not work with OpenJDK; please use the Sun Java 6 JDK.
How to use it

For it to work you need a database server and a server to run MyDeDub on. If one system has enough resources (RAM, CPU), these two can be combined.
Installation on MySQL

Create a database and grant a user INSERT, SELECT, UPDATE and DELETE rights, e.g.:

grant insert, select, update, delete on mdd.* to mdduser@'%' identified by 'mddpass';
Create these tables:
CREATE TABLE `blockmap` (
  `sector` bigint(12) NOT NULL DEFAULT '-1',
  `blockid` bigint(12) NOT NULL DEFAULT '-1',
  PRIMARY KEY (`sector`),
  KEY `bi` (`blockid`)
);

CREATE TABLE `config` (
  `name` varchar(255) NOT NULL,
  `value` varchar(255) NOT NULL,
  PRIMARY KEY (`name`)
);

CREATE TABLE `data` (
  `blockid` bigint(12) NOT NULL auto_increment,
  `data` blob NOT NULL,
  PRIMARY KEY (`blockid`),
  KEY `data` (`data`(16))
);

You might want to use InnoDB tables instead of MyISAM. This gives lower performance, but you can be more confident that your data has reached the disk if the database server crashes (MyISAM does not do an explicit fsync after each write).
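To illustrate how the `blockmap` and `data` tables could work together, here is a minimal in-memory sketch in Java. It is not MyDeDub's actual code: the class DedupSketch and its methods are made up for this example, and finding identical blocks through a lookup on the block contents is an assumption based on the index on the `data` column.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the `blockmap` and `data` tables (hypothetical, not MyDeDub code).
final class DedupSketch {
    private final Map<Long, Long> blockmap = new HashMap<Long, Long>();        // sector -> blockid
    private final Map<Long, byte[]> data = new HashMap<Long, byte[]>();        // blockid -> contents
    private final Map<ByteBuffer, Long> byContent = new HashMap<ByteBuffer, Long>(); // contents -> blockid
    private long nextBlockId = 1;                                              // mimics auto_increment

    // Stores one block; identical contents are stored only once.
    // Simplification: blocks that are no longer referenced are never removed.
    void write(long sector, byte[] block) {
        ByteBuffer key = ByteBuffer.wrap(block.clone()); // ByteBuffer compares by content
        Long blockId = byContent.get(key);
        if (blockId == null) {
            blockId = nextBlockId++;
            data.put(blockId, block.clone());
            byContent.put(key, blockId);
        }
        blockmap.put(sector, blockId);
    }

    // Returns a copy of the block stored for this sector, or null if it was never written.
    byte[] read(long sector) {
        Long blockId = blockmap.get(sector);
        return blockId == null ? null : data.get(blockId).clone();
    }

    int mappedSectors() { return blockmap.size(); }

    int storedBlocks() { return data.size(); }
}
```

Writing the same 4096-byte block to two different sectors gives two entries in the sector map but only one stored block, which is where the space saving comes from.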
In the 'config' table, add a record with name 'size' whose value is the size (in bytes) of your storage. Also add a record with name 'block_size' and value '4096'.
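As an example, the two records can be produced with a small helper like the one below. This is only a sketch: the class ConfigRows is hypothetical and not part of MyDeDub, the 10 GiB size is an arbitrary choice, and the check that the size is a whole number of blocks is an assumption (the text above does not require it).

```java
import java.util.Locale;

// Hypothetical helper, not part of MyDeDub: prints INSERT statements for the `config` table.
final class ConfigRows {
    static final long BLOCK_SIZE = 4096; // the block_size value suggested above

    // Converts a size in GiB to bytes (throws ArithmeticException on overflow).
    static long gibToBytes(long gib) {
        return Math.multiplyExact(gib, 1L << 30);
    }

    // Assumption: the device size should be a whole number of blocks.
    static boolean isBlockAligned(long bytes, long blockSize) {
        return bytes > 0 && bytes % blockSize == 0;
    }

    // Builds one INSERT statement; both columns are varchar, so the value is quoted.
    static String insertFor(String name, long value) {
        return String.format(Locale.ROOT,
                "INSERT INTO `config` (`name`, `value`) VALUES ('%s', '%d');", name, value);
    }

    public static void main(String[] args) {
        long size = gibToBytes(10); // 10 GiB, an arbitrary example size
        if (!isBlockAligned(size, BLOCK_SIZE)) {
            throw new IllegalArgumentException("size is not a multiple of block_size");
        }
        System.out.println(insertFor("size", size));
        System.out.println(insertFor("block_size", BLOCK_SIZE));
    }
}
```

For 10 GiB this prints a 'size' record with value '10737418240' and a 'block_size' record with value '4096'; paste the two statements into the MySQL client.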
Running it

Then invoke the program with the following parameters:
--db-url jdbc:mysql://localhost:3306/mdd --db-user mdduser --db-pass mddpass --port 12345
You will probably need to adjust these values (database URL, user, password and port) to your own setup.
The port is the port to which the nbd-client connects.
java -cp /usr/share/java/mysql-connector-java.jar:MyDeDub.jar MyDeDub \
    --db-url jdbc:mysql://localhost:3306/mdd --db-user mdduser \
    --db-pass mddpass --port 2209

/usr/share/java/mysql-connector-java.jar is the default location on Debian systems for the MySQL JDBC connector. On other systems (e.g. Red Hat) this location might be different.
The server that will use the device will do something like:
nbd-client bs=4096 mydedubhost port /dev/nbdX
See the man-page of nbd-client for details. Note that the 'bs=4096' parameter should be equal to the block_size configuration parameter of MyDeDub.
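Checking how much disk space is gained

In the MySQL client, enter the following query:

select count(*) / count(distinct(blockid)) from blockmap;

A result bigger than 1 means space was won: on average, that many sectors share one stored block. The query itself cannot return less than 1. A result of exactly 1 means nothing was deduplicated, and the storage then most likely needs more disk space than the original data because of the database overhead.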
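To make the arithmetic of this query concrete, here is a small Java sketch that does the same calculation on an in-memory sector-to-blockid map. The class DedupRatio and the sample mapping are made up for illustration; the real numbers come from the blockmap table.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example, not part of MyDeDub.
final class DedupRatio {
    // Same arithmetic as: select count(*) / count(distinct(blockid)) from blockmap
    static double ratio(Map<Long, Long> blockmap) {
        if (blockmap.isEmpty()) {
            throw new IllegalArgumentException("blockmap is empty");
        }
        int rows = blockmap.size();                                   // count(*)
        int distinct = new HashSet<Long>(blockmap.values()).size();   // count(distinct blockid)
        return (double) rows / distinct;
    }

    public static void main(String[] args) {
        Map<Long, Long> blockmap = new LinkedHashMap<Long, Long>();
        blockmap.put(0L, 1L); // sectors 0, 1 and 3 hold identical data
        blockmap.put(1L, 1L);
        blockmap.put(2L, 2L); // sector 2 holds different data
        blockmap.put(3L, 1L);
        System.out.println(ratio(blockmap)); // 4 sectors / 2 distinct blocks
    }
}
```

With four written sectors pointing at two distinct blocks the result is 2.0, meaning the block data takes about half the space it would take without deduplication (not counting the database overhead).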
Warning

This is still an alpha version: don't use it with data you don't have a backup of.
License

In short: it is released under GPLv2.
MyDeDub is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.