One of the most important components of a network automation strategy is how to manage your data and/or Source of Truth (SoT). In networking, the data sets are often massive. As an example, most Fortune 500 type companies will have to manage a few hundred thousand to potentially millions of switchports, each with several data points (VLAN, MTU, IP, etc.) per interface. The scale and criticality of this data present several challenges. The intention of this blog is to describe the need for a versionable database in network automation and to serve as a primer to a versionable database.
Problem Space
When managing and storing your Source of Truth data, there are two primary mechanisms, each with its own pros and cons.
Source Control (Git)
Provides the ability to version data in such a way the exact state of the data at any point in the past is known.
Provides the ability to know who is the owner of the data.
Provides the ability to populate data in a staging area, without modifying the production data.
Tooling integration with things such as CI Systems.
Database
Has a native querying language to obtain, filter, and all around work with the data.
Provides schema enforcement of the data.
The ability to scale to large datasets.
There is a clear dichotomy between these two choices. On one hand, there are great integrations with all standard DevOps tooling; on the other hand, there is an enterprise-grade manager of the data.
History
For several years, a few of us have been searching for solutions that combine these two concepts. If one could build a database with Git constructs, this would allow versionable management of your data, with the ability to query, effectively store, and manage at scale. Through various searches it would seem that this has not been solved in any meaningful way. The closest I have come across was a project called NOMS, but it has limitations.
Recently a startup company called Liquidata was founded and is creating a solution: a library called Dolt. Dolt is written on top of the NOMS project, in Go. The intention is to support all Git semantics (such as branch, add, commit, etc.), as well as all MySQL semantics (such as insert, create, update, etc.).
By supporting all MySQL semantics, applications that use a MySQL server should be able to simply use a Dolt-SQL server with no disruption. Whether you are using an ODBC driver or raw SQL, in theory it should still work. You would naturally have to build in the capability for your application to take advantage of the Git capabilities or use traditional “Git-like” workflows.
By supporting all Git semantics, data should be able to be managed in a decentralized fashion using Git workflows to branch, add, diff, and merge the data. In a similar vein to GitHub, they have developed DoltHub to provide the level of tooling expected in a standard Git user interface, such as forking, APIs, webhooks, and CI integrations.
Primer
In this brief introduction to the technology, we will create a Dolt data repository called “simple-inventory” and run it on a dolt-sql server. If you care to follow along, the only requirements are Docker and internet access.
Setup
Start the Docker container.
docker run -it --rm --name dolt golang:1.12.14
Install Dolt by running curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash
root@2a29f8a34dd3:/go# curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   601    0   601    0     0   2647      0 --:--:-- --:--:-- --:--:--  2647
100  3038  100  3038    0     0   8392      0 --:--:-- --:--:-- --:--:--  8392
Downloading: https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/dolt-linux-amd64.tar.gz
Installing dolt, git-dolt and git-dolt-smudge to /usr/local/bin.
root@2a29f8a34dd3:/go#
You can create a Dolt data repository by creating a folder, and then initializing the repository. Just like Git, in Dolt, you need to add yourself to the Dolt config. If you are familiar with Git, these commands will be familiar, only changing the command from git to dolt, but keeping all of the other options the same.
root@2a29f8a34dd3:/go# mkdir simple-inventory
root@2a29f8a34dd3:/go# cd simple-inventory
root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.email "ken@celenza.org"
Config successfully updated.
root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.name "itdependsnetworks"
Config successfully updated.
root@2a29f8a34dd3:/go/simple-inventory# dolt init
Successfully initialized dolt data repository.
root@2a29f8a34dd3:/go/simple-inventory# dolt status
On branch master
nothing to commit, working tree clean
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:33 +0000 2020

    Initialize data repository

root@2a29f8a34dd3:/go/simple-inventory#
Creating Tables
Ideally, if you are familiar with MySQL, the only difference should be using the Dolt engine instead of a MySQL server; everything else should be the same. To enter the Dolt SQL shell, simply use the command dolt sql.
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Create tables as you normally would. In this example, we will create two tables. The table device_inventory will hold the inventory of the devices, with columns for hostname and IP address. The table vlan will have columns for the hostname (with a foreign key relationship to the device_inventory table), the VLAN, and the name of the VLAN.
Note: Though the foreign key relationship syntax is defined, it is not a currently supported Dolt feature.
doltsql> CREATE TABLE device_inventory (
      -> hostname varchar(32) NOT NULL,
      -> ip_address varchar(15) NOT NULL,
      -> primary key (`hostname`)
      -> );
doltsql> CREATE TABLE vlan (
      -> hostname varchar(32),
      -> vlan int NOT NULL,
      -> name varchar(32) NOT NULL,
      -> PRIMARY KEY (hostname, vlan),
      -> FOREIGN KEY (hostname) REFERENCES device_inventory(hostname)
      -> );
doltsql> exit
Here is where it starts to get interesting: we can combine these two different concepts and see how they interact. So far, the schema has not been committed into master; the new tables exist only in our local working set. We can view this by issuing standard Git-like commands when we return to the command line from the Dolt SQL shell.
Run the dolt status, dolt diff, and dolt diff -q commands.
root@2a29f8a34dd3:/go/simple-inventory# dolt status
On branch master
Untracked files:
  (use "dolt add <table>" to include in what will be committed)
        new table:      device_inventory
        new table:      vlan
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/device_inventory b/device_inventory
added table
diff --dolt a/vlan b/vlan
added table
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
CREATE TABLE `device_inventory` (
  `hostname` LONGTEXT NOT NULL COMMENT 'tag:0',
  `ip_address` LONGTEXT NOT NULL COMMENT 'tag:1',
  PRIMARY KEY (`hostname`)
);
CREATE TABLE `vlan` (
  `hostname` LONGTEXT NOT NULL COMMENT 'tag:0',
  `vlan` BIGINT NOT NULL COMMENT 'tag:1',
  `name` LONGTEXT NOT NULL COMMENT 'tag:2',
  PRIMARY KEY (`hostname`,`vlan`)
);
root@2a29f8a34dd3:/go/simple-inventory#
The dolt status command shows that the new tables are still untracked, so the schema has not yet “taken effect” by being committed to master. The dolt diff command shows that new tables have been added and, with the optional -q flag, prints the equivalent SQL statements. Note the additional tag comments; these are used to track columns, whose names may change over time and cause name conflicts. The new schema can now be committed into master.
Issue the commands dolt add -a, dolt commit -m 'initial schema', and dolt log.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'initial schema'
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:41 +0000 2020

    initial schema

root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:41 +0000 2020

    initial schema

commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:33 +0000 2020

    Initialize data repository

root@2a29f8a34dd3:/go/simple-inventory#
Awesome! The first piece of information in the database has been committed.
Adding Data
Defining a schema is not all that valuable without the data. Again, using standard SQL constructs, we can commit the data. Let’s use proper branching strategies this time.
Create a branch and enter dolt sql.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b intial_data
Switched to branch 'intial_data'
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Add data
doltsql> INSERT INTO device_inventory (hostname, ip_address) VALUES ("nyc-sw01","10.1.1.1");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO device_inventory (hostname, ip_address) VALUES ("nyc-sw02","10.1.1.2");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01",10,"user");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01",20,"printer");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01",30,"wap");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02",10,"user");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02",20,"printer");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02",30,"wap");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> exit
Bye
Now that the data is in our working set, we can view the changes in two different ways. The first is more akin to a Unix diff, and the other is via raw SQL statements.
View the diff. In the console, diffs are actually color coded green and red for add and remove respectively.
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/device_inventory b/device_inventory
--- a/device_inventory @ tktob1spsoos6isfdj4o9benp9c57iic
+++ b/device_inventory @ b37haqr0n9j1e5bckj35sqt4kntojlp9
+-----+----------+------------+
|     | hostname | ip_address |
+-----+----------+------------+
|  +  | nyc-sw01 | 10.1.1.1   |
|  +  | nyc-sw02 | 10.1.1.2   |
+-----+----------+------------+
diff --dolt a/vlan b/vlan
--- a/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn
+++ b/vlan @ e2kovelv2sfjuo584o0v6hp76207k720
+-----+----------+------+---------+
|     | hostname | vlan | name    |
+-----+----------+------+---------+
|  +  | nyc-sw01 | 10   | user    |
|  +  | nyc-sw01 | 20   | printer |
|  +  | nyc-sw01 | 30   | wap     |
|  +  | nyc-sw02 | 10   | user    |
|  +  | nyc-sw02 | 20   | printer |
|  +  | nyc-sw02 | 30   | wap     |
+-----+----------+------+---------+
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
INSERT INTO `device_inventory` (`hostname`,`ip_address`) VALUES ("nyc-sw01","10.1.1.1");
INSERT INTO `device_inventory` (`hostname`,`ip_address`) VALUES ("nyc-sw02","10.1.1.2");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",10,"user");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",20,"printer");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",30,"wap");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",10,"user");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",20,"printer");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",30,"wap");
root@2a29f8a34dd3:/go/simple-inventory#
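Personally, I’m pretty impressed with what is happening here, but there is still more to do to complete this workflow. The data needs to be committed on the “feature” branch (with dolt add -a and dolt commit -m 'initial data', following the same pattern as the schema commit earlier) and then merged into the master branch.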
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master
Switched to branch 'master'
root@2a29f8a34dd3:/go/simple-inventory# dolt branch
  intial_data
* master
root@2a29f8a34dd3:/go/simple-inventory# dolt merge intial_data
Updating 4n29eaqb1v9pmnpnda2r43fc91p1vp9l..gbt6r21mtbd0cvs5iahog912essjhn0n
Fast-forward
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit gbt6r21mtbd0cvs5iahog912essjhn0n
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:50 +0000 2020

    initial data

commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:41 +0000 2020

    initial schema

commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:33 +0000 2020

    Initialize data repository

root@2a29f8a34dd3:/go/simple-inventory#
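The "Fast-forward" reported by the merge can be pictured with a toy commit graph in which branches are simply pointers to commits. The Python sketch below is a conceptual model only and is not how Dolt is implemented; the Repo class, its method names, and the c1/c2/c3 commit ids are hypothetical and chosen purely for illustration.

```python
"""Toy model of branches as pointers into a commit graph (not Dolt's implementation)."""


class Repo:
    def __init__(self):
        self.parents = {}    # commit id -> parent commit id (None for the root)
        self.branches = {}   # branch name -> commit id at the tip of the branch
        self.current = "master"
        self._counter = 0

    def commit(self, message):
        """Create a commit on the current branch and advance its tip."""
        self._counter += 1
        commit_id = f"c{self._counter}:{message}"
        self.parents[commit_id] = self.branches.get(self.current)
        self.branches[self.current] = commit_id
        return commit_id

    def checkout(self, name, create=False):
        """Switch branches; with create=True, start the new branch at the current tip."""
        if create:
            self.branches[name] = self.branches[self.current]
        elif name not in self.branches:
            raise KeyError(f"unknown branch: {name}")
        self.current = name

    def is_ancestor(self, ancestor, descendant):
        """Walk parent links from descendant, looking for ancestor."""
        node = descendant
        while node is not None:
            if node == ancestor:
                return True
            node = self.parents[node]
        return False

    def merge(self, other):
        """Fast-forward the current branch to the other tip when history allows it."""
        ours, theirs = self.branches[self.current], self.branches[other]
        if self.is_ancestor(theirs, ours):
            return "Already up to date"
        if self.is_ancestor(ours, theirs):
            # No new commit is needed: just move the branch pointer forward.
            self.branches[self.current] = theirs
            return "Fast-forward"
        raise NotImplementedError("diverged histories need a real three-way merge")


# Replay the shape of the workflow from this post.
repo = Repo()
repo.commit("Initialize data repository")
repo.commit("initial schema")
repo.checkout("intial_data", create=True)
repo.commit("initial data")
repo.checkout("master")
result = repo.merge("intial_data")
print(result)
print(repo.branches["master"] == repo.branches["intial_data"])
```

Because master had no commits of its own after the branch point, the merge only needs to move the master pointer forward, which is why the log afterwards shows a straight line of three commits.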
So far this is showing the “Create” part of standard CRUD operations is working, meaning we can add data to the repository.
Update Data
Naturally, as soon as data is entered, you will want to modify it. The process to modify is the same: branch, SQL statements, add, commit, and merge.
Check out a new branch, and enter Dolt SQL.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b change_vlan
Switched to branch 'change_vlan'
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Update the data.
doltsql> update vlan set name = "prnt" where name = "printer";
+---------+---------+
| matched | updated |
+---------+---------+
| 2       | 2       |
+---------+---------+
doltsql> exit
Bye
View the diff.
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/vlan b/vlan
--- a/vlan @ bjg5ou111lmu03ciu6aisphf7m6jbk1r
+++ b/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn
+-----+----------+------+---------+
|     | hostname | vlan | name    |
+-----+----------+------+---------+
|  <  | nyc-sw01 | 20   | printer |
|  >  | nyc-sw01 | 20   | prnt    |
|  <  | nyc-sw02 | 20   | printer |
|  >  | nyc-sw02 | 20   | prnt    |
+-----+----------+------+---------+
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw01" AND `vlan`=20);
UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw02" AND `vlan`=20);
root@2a29f8a34dd3:/go/simple-inventory#
A keen eye will note that the update statements shown in the diff are not the same as the update statement entered in the SQL shell. This conversion makes sense: the change could later be merged into a branch whose table contains more or fewer rows, so the diff records the specific rows that changed, identified by primary key, rather than the original filter. It also illustrates the complexity of building this technology.
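To make that conversion concrete, here is a minimal Python sketch of a row-level diff that compares two snapshots of a table and emits one statement per changed row, addressed by primary key. It is an illustration under stated assumptions and not Dolt's actual algorithm: the function name row_diff_to_sql, the snapshot layout (a dict keyed by primary-key tuple), and the naive quoting are all hypothetical.

```python
def row_diff_to_sql(table, key_columns, before, after):
    """Return one SQL statement per changed row, addressed by primary key.

    before/after map a primary-key tuple to a dict of the non-key columns.
    Quoting is deliberately naive (no escaping); this is for illustration only.
    """
    def quote(value):
        return str(value) if isinstance(value, int) else '"' + str(value) + '"'

    def where(key):
        return " AND ".join(
            f"`{col}`={quote(val)}" for col, val in zip(key_columns, key)
        )

    statements = []
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old == new:
            continue  # row untouched, so it does not appear in the diff
        if old is None:
            cols = ",".join(f"`{c}`" for c in list(key_columns) + list(new))
            vals = ",".join(quote(v) for v in list(key) + list(new.values()))
            statements.append(f"INSERT INTO `{table}` ({cols}) VALUES ({vals});")
        elif new is None:
            statements.append(f"DELETE FROM `{table}` WHERE ({where(key)});")
        else:
            changed = {c: v for c, v in new.items() if old.get(c) != v}
            assignments = ", ".join(f"`{c}`={quote(v)}" for c, v in changed.items())
            statements.append(
                f"UPDATE `{table}` SET {assignments} WHERE ({where(key)});"
            )
    return statements


before = {
    ("nyc-sw01", 10): {"name": "user"},
    ("nyc-sw01", 20): {"name": "printer"},
    ("nyc-sw02", 20): {"name": "printer"},
}
# Apply the bulk change: every "printer" becomes "prnt".
after = {
    key: {"name": "prnt"} if row["name"] == "printer" else row
    for key, row in before.items()
}

statements = row_diff_to_sql("vlan", ("hostname", "vlan"), before, after)
for statement in statements:
    print(statement)
```

The single bulk update touches two rows, so the sketch produces two keyed UPDATE statements, each of which can be replayed against a table regardless of what other rows it contains.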
Commit to a feature branch and merge to master.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'update data'
commit jqe3vb84fhfunn7uapksdt5thb4rr7r7
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:23:03 +0000 2020

    update data

root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master
Switched to branch 'master'
root@2a29f8a34dd3:/go/simple-inventory# dolt merge change_vlan
Updating gbt6r21mtbd0cvs5iahog912essjhn0n..jqe3vb84fhfunn7uapksdt5thb4rr7r7
Fast-forward
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
View the data from dolt sql.
doltsql> select * from vlan;
+----------+------+------+
| hostname | vlan | name |
+----------+------+------+
| nyc-sw01 | 10   | user |
| nyc-sw01 | 20   | prnt |
| nyc-sw01 | 30   | wap  |
| nyc-sw02 | 10   | user |
| nyc-sw02 | 20   | prnt |
| nyc-sw02 | 30   | wap  |
+----------+------+------+
doltsql>
You can also see who owns each row by tracing it to the commit that last changed it, along with the author, time, and commit hash metadata, using the dolt blame <table> command.
root@2a29f8a34dd3:/go/simple-inventory# dolt blame vlan
+----------+------+--------------+-------------------+------------------------------+-------------------+
| HOSTNAME | VLAN | COMMIT MSG   | AUTHOR            | TIME                         | COMMIT            |
+----------+------+--------------+-------------------+------------------------------+-------------------+
| nyc-sw01 | 30   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw02 | 10   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw02 | 20   | update data  | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua |
| nyc-sw02 | 30   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw01 | 10   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw01 | 20   | update data  | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua |
+----------+------+--------------+-------------------+------------------------------+-------------------+
root@2a29f8a34dd3:/go/simple-inventory#
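The idea behind a per-row blame can be sketched in a few lines of Python: walk the history from oldest to newest and remember, for every primary key, the last commit in which that row differed from the previous snapshot. This is a simplified model under assumptions, not Dolt's implementation; the blame function and the snapshot-per-commit layout are hypothetical, and the shortened commit ids are taken from the output above.

```python
def blame(history):
    """Map each primary key to the (message, commit id) that last changed its row.

    history is a list of (commit_id, message, snapshot) tuples, oldest first,
    where snapshot maps a primary-key tuple to the row's non-key columns.
    """
    last_change = {}
    previous = {}
    for commit_id, message, snapshot in history:
        for key, row in snapshot.items():
            # A row is attributed to this commit if it is new or its values changed.
            if previous.get(key) != row:
                last_change[key] = (message, commit_id)
        # Rows removed by this commit no longer exist, so they drop out of the blame.
        for key in previous.keys() - snapshot.keys():
            last_change.pop(key, None)
        previous = snapshot
    return last_change


initial = {
    (host, vlan): {"name": name}
    for host in ("nyc-sw01", "nyc-sw02")
    for vlan, name in ((10, "user"), (20, "printer"), (30, "wap"))
}
updated = {
    key: {"name": "prnt"} if row["name"] == "printer" else row
    for key, row in initial.items()
}

history = [
    ("4n29eaqb1v9pmnpnd", "initial schema", {}),
    ("gbt6r21mtbd0cvs5i", "initial data", initial),
    ("jqe3vb84fhfunn7ua", "update data", updated),
]

result = blame(history)
for key in sorted(result):
    print(key, result[key])
```

With the three commits from this walkthrough, the two VLAN 20 rows are attributed to the "update data" commit and the remaining rows to "initial data", matching the table printed by dolt blame.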
Conclusion
This is just a primer, and there is still a lot of work to be done to truly have feature parity with both MySQL and Git, but this is a great step in the right direction. I plan to continue to monitor and follow up with examples, use cases, and library updates over time. Specifically, next time, I want to extend the workflow to include DoltHub to review those capabilities as well.
Does this all sound amazing? Want to know more about how Network to Code can help you do this? Reach out to our sales team. If you want to help make this a reality for our clients, check out our careers page.