Abstract: This post gives an overview of the architecture of the hpfriends platform. Readers who want to learn how to use hpfriends should have a look at the guide on using hpfriends.
hpfeeds is a simple publish/subscribe data sharing model. It was initially created by Mark ‘rep’ Schloesser as a way to carry high-volume real-time data from different pieces of honeypot software between members of the Honeynet project. hpfriends is an evolution of the hpfeeds data sharing model. It uses the same wire protocol and thus maintains backwards-compatibility with all data sources / sinks.
However, instead of relying on access authorization based on channels, hpfriends uses a social graph in order to make sharing data even easier and more natural. This also circumvents some policy issues about which people should be able to use hpfeeds and who should be allowed onto which channels.
Channels in hpfeeds are commonly used to group data of the same origin and structure. A channel dedicated to data from the dionaea honeypot software can be expected to carry data adhering to a specific format, suitable for automated processing. Before hpfriends, channel names were unique and global within the hpfeeds system. This meant that once a channel was established by a user (such as dionaea.captures), that user had to grant permission each time another user wanted to publish or subscribe to the channel.
hpfriends uses a social sharing graph as its backend database. That means that users and groups are represented as nodes, while the sharing relationships are modelled as edges between those nodes. Sharing is not a binary attribute, which is why each edge carries a number of attributes, such as the type of the relationship and the channel name.
A (very small) subset of the hpfriends sharing graph.
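To make the graph model a bit more concrete, here is a minimal in-memory sketch of users, groups and attributed sharing edges. It is not the hpfriends implementation: the class names `Node`, `ShareEdge` and `SharingGraph`, the `kind` and `rel_type` fields and the `"share"` label are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # "user" or "group" (hypothetical field)


@dataclass(frozen=True)
class ShareEdge:
    source: Node   # node that shares its messages
    target: Node   # node that receives them
    channel: str   # channel name the edge applies to
    rel_type: str  # type of the relationship (label is an assumption)


class SharingGraph:
    """Toy stand-in for the sharing graph: a set of attributed edges."""

    def __init__(self):
        self.edges = set()

    def share(self, source, target, channel, rel_type="share"):
        edge = ShareEdge(source, target, channel, rel_type)
        self.edges.add(edge)
        return edge

    def edges_from(self, source, channel):
        # All sharing edges that start at `source` for the given channel.
        return [e for e in self.edges
                if e.source == source and e.channel == channel]


jojo = Node("jojo", "user")
mark = Node("mark", "user")
graph = SharingGraph()
graph.share(jojo, mark, "dionaea.captures")
```

Because the channel name lives on the edge rather than on a global channel object, two unrelated users can both use `dionaea.captures` without ever seeing each other's data.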
Channels are no longer global broadcast rooms but rather local decisions. For each message published by a user on a channel, a decision is made whether other users can access it, based on the existence of a sharing relationship between the publishing user and other users and groups. Since a channel name can be used by anyone to publish messages, channels have de facto become exclusive mandatory tags for messages.
If user jojo has a sharing relationship on channel dionaea.captures with user mark, then mark will be able to read messages on that channel. Other user groups might also use the channel name dionaea.captures, yet have no sharing relationships with jojo or mark. User mark may in turn decide to also share his messages with jojo, or even re-distribute the messages he received from jojo to third parties, i.e. other users and groups. The difference between sharing only messages generated by the user or also incoming messages by other users is indicated using the
Groups are used to address a number of people without having to establish sharing relationships with each individual member. In our database, groups are represented as nodes which have an edge to each member.
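The access decision, including groups, can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `SharingGraph` class and its methods are hypothetical, plain dictionaries stand in for the graph database, and re-distribution of received messages is left out entirely.

```python
class SharingGraph:
    """Toy model: who may read messages from a given user and channel."""

    def __init__(self):
        self.groups = {}  # group name -> set of member names
        self.shares = {}  # (publisher, channel) -> set of user/group names

    def add_group(self, group, members):
        self.groups[group] = set(members)

    def share(self, publisher, channel, target):
        self.shares.setdefault((publisher, channel), set()).add(target)

    def readers(self, publisher, channel):
        # Resolve every sharing target to users, expanding groups
        # into their members.
        result = set()
        for target in self.shares.get((publisher, channel), set()):
            if target in self.groups:
                result |= self.groups[target]
            else:
                result.add(target)
        return result


graph = SharingGraph()
graph.add_group("analysts", ["alice", "bob"])
graph.share("jojo", "dionaea.captures", "mark")
graph.share("jojo", "dionaea.captures", "analysts")

jojo_readers = graph.readers("jojo", "dionaea.captures")
eve_readers = graph.readers("eve", "dionaea.captures")
```

Here `jojo_readers` contains mark plus both members of the group, while `eve_readers` is empty: eve uses the same channel name but has no sharing relationship with anyone.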
hpfriends uses Authkeys in the same fashion as the hpfeeds system. In fact, the hpfriends message broker is backwards-compatible with the hpfeeds tools. Authkeys are used as single-purpose tokens for the different pieces of backend software that publish and subscribe to data channels. Each Authkey has a list of channels it is allowed to publish to and a list of channels it is allowed to subscribe to.
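As a rough illustration of what an Authkey carries, the sketch below models one as an immutable record with separate publish and subscribe lists. The field names (`ident`, `secret`, `publish`, `subscribe`) and the example values are assumptions for this sketch, not the actual hpfriends schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authkey:
    ident: str                          # identifier the client presents
    secret: str                         # shared secret for authentication
    publish: frozenset = frozenset()    # channels this key may publish to
    subscribe: frozenset = frozenset()  # channels this key may subscribe to

    def may_publish(self, channel):
        return channel in self.publish

    def may_subscribe(self, channel):
        return channel in self.subscribe


# A single-purpose key for a sensor that only publishes captures.
sensor_key = Authkey(
    ident="sensor01",
    secret="not-a-real-secret",
    publish=frozenset({"dionaea.captures"}),
)
```

Keeping one key per piece of software means a compromised sensor key can publish to its own channels but cannot read anything.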
The hpfriends system is still very much work-in-progress, which is why the implementation details are subject to change.
The sharing graph was realized with the Neo4j database, an open-source graph database implemented in Java. Although Neo4j enables very efficient graph storage and operations, it does have a few quirks. Concerning hpfeeds, it is obviously quite impractical and unnecessary to do a lookup on the sharing graph for each new message on a channel. Instead, we traverse the graph whenever the sharing relationships change and save the sharing attributes in a flat format afterwards.
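The "traverse on change, look up flat" idea can be sketched like this. It is a simplified illustration, not the real broker: plain `(publisher, channel, reader)` tuples stand in for the Neo4j graph, and `flatten`, `Broker`, `set_shares` and `recipients` are hypothetical names.

```python
def flatten(edges):
    """Turn sharing edges into a flat (publisher, channel) -> readers table."""
    table = {}
    for publisher, channel, reader in edges:
        table.setdefault((publisher, channel), set()).add(reader)
    return {key: frozenset(readers) for key, readers in table.items()}


class Broker:
    def __init__(self):
        self.table = {}

    def set_shares(self, edges):
        # Called only when sharing relationships change: this is the
        # expensive step that would walk the graph in a real system.
        self.table = flatten(edges)

    def recipients(self, publisher, channel):
        # Called for every message: a single dictionary lookup,
        # no graph traversal involved.
        return self.table.get((publisher, channel), frozenset())


broker = Broker()
broker.set_shares([
    ("jojo", "dionaea.captures", "mark"),
    ("jojo", "dionaea.captures", "alice"),
    ("mark", "dionaea.captures", "jojo"),
])
```

The trade-off is that a change to the sharing graph costs a rebuild, which is acceptable as long as relationships change far less often than messages arrive.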