{"id":131,"date":"2015-01-18T16:57:30","date_gmt":"2015-01-18T16:57:30","guid":{"rendered":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=131"},"modified":"2017-11-13T09:07:36","modified_gmt":"2017-11-13T09:07:36","slug":"dissection-modules","status":"publish","type":"page","link":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=131","title":{"rendered":"Dissection modules"},"content":{"rendered":"<p>The purpose of dissection modules is to parse data into Orchids event fields.<\/p>\n<p>Henceforth, <em>dissecting<\/em> will mean the same thing as <em>parsing<\/em>.<\/p>\n<p>Let us take an example.\u00a0 Imagine you want to give the contents of file <code>\/var\/log\/messages<\/code> as (polled) input to Orchids. This is a file in <a title=\"syslog format\" href=\"https:\/\/en.wikipedia.org\/wiki\/Syslog#Format_of_a_Syslog_packet\"><code>syslog<\/code> format<\/a>, and you would like Orchids to be able to parse this format. We write:<\/p>\n<pre>INPUT\t\ttextfile\t\"\/var\/log\/messages\"\r\nDISSECT syslog\ttextfile\t\"\/var\/log\/messages\"\r\n<\/pre>\n<p>in the <a title=\"orchids-inputs.conf\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=145\"><code>orchids-inputs.conf<\/code><\/a> configuration file. The first line will tell Orchids<br \/>\nto use the <a title=\"The textfile module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=87\"><code>textfile<\/code><\/a> input module to read data from file <code>\/var\/log\/messages<\/code>. 
On reading the following line from <code>\/var\/log\/messages<\/code> (say, line 1065):<\/p>\n<pre>Apr 25 23:22:40 laramie sendmail[102]: NOQUEUE: SYSERR(root): \/etc\/sendmail.cf: line 0: cannot open: No such file or directory\r\n<\/pre>\n<p>the <a title=\"The textfile module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=87\"><code>textfile<\/code><\/a> module alone would produce the event:<\/p>\n<table style=\"border: solid 1px black;\">\n<tbody>\n<tr style=\"background-color: lightsteelblue;\">\n<th>Field<\/th>\n<th>Contents<\/th>\n<\/tr>\n<\/tbody>\n<tbody>\n<tr style=\"background-color: lightgrey;\">\n<td><code>.textfile.line_num<\/code><\/td>\n<td>1065<\/td>\n<\/tr>\n<tr style=\"background-color: white;\">\n<td><code>.textfile.file<\/code><\/td>\n<td><code>\"\/var\/log\/messages\"<\/code><\/td>\n<\/tr>\n<tr style=\"background-color: lightgrey;\">\n<td><code>.textfile.line<\/code><\/td>\n<td><code>\"Apr 25 23:22:40 laramie sendmail[102]: NOQUEUE: SYSERR(root): \/etc\/sendmail.cf: line 0: cannot open: No such file or directory\"<\/code><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The second line of our configuration file, <code>DISSECT syslog textfile \"\/var\/log\/messages\"<\/code>, will tell Orchids that, for all input lines coming from the same source (<code>\/var\/log\/messages<\/code>) and the same module (<a title=\"The textfile module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=87\"><code>textfile<\/code><\/a>), the <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> dissection module should be used to parse the <code>.textfile.line<\/code> string.<\/p>\n<p>With the aforementioned <code>DISSECT<\/code> directive, the <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> module will add further fields, resulting in the following event, 
which is then fed to the Orchids engine:<\/p>\n<table style=\"border: solid 1px black;\">\n<tbody>\n<tr style=\"background-color: lightsteelblue;\">\n<th>Field<\/th>\n<th>Contents<\/th>\n<\/tr>\n<\/tbody>\n<tbody>\n<tr style=\"background-color: lightgrey;\">\n<td><code>.textfile.line_num<\/code><\/td>\n<td>1065<\/td>\n<\/tr>\n<tr style=\"background-color: white;\">\n<td><code>.textfile.file<\/code><\/td>\n<td><code>\"\/var\/log\/messages\"<\/code><\/td>\n<\/tr>\n<tr style=\"background-color: lightgrey;\">\n<td><code>.textfile.line<\/code><\/td>\n<td><code>\"Apr 25 23:22:40 laramie sendmail[102]: NOQUEUE: SYSERR(root): \/etc\/sendmail.cf: line 0: cannot open: No such file or directory\"<\/code><\/td>\n<\/tr>\n<tr style=\"background-color: white;\">\n<td><code>.syslog.time<\/code><\/td>\n<td>April 25, 23:22:40\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 (as a value of type <code>ctime<\/code>, not <code>str<\/code>)<\/td>\n<\/tr>\n<tr style=\"background-color: lightgrey;\">\n<td><code>.syslog.host<\/code><\/td>\n<td><code>\"laramie\"<\/code><\/td>\n<\/tr>\n<tr style=\"background-color: white;\">\n<td><code>.syslog.pid<\/code><\/td>\n<td>102<\/td>\n<\/tr>\n<tr style=\"background-color: lightgrey;\">\n<td><code>.syslog.prog<\/code><\/td>\n<td><code>\"sendmail\"<\/code><\/td>\n<\/tr>\n<tr style=\"background-color: white;\">\n<td><code>.syslog.msg<\/code><\/td>\n<td><code>\"NOQUEUE: SYSERR(root): \/etc\/sendmail.cf: line 0: cannot open: No such file or directory\"<\/code><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>How it works<\/h3>\n<p>At configuration time, the <code>DISSECT<\/code> directive uses its third argument (here, the string <code>\"\/var\/log\/messages\"<\/code>) as a <em>tag<\/em>. 
There may be several <code>DISSECT<\/code> directives associated with the same pair of modules (here, <a title=\"The textfile module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=87\"><code>textfile<\/code><\/a> and\u00a0 <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a>), provided they have different tags.\u00a0 This is so that the same dissection module can dissect several different sources of events.<\/p>\n<p>Remember from the <a title=\"Input modules\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=126\">input modules<\/a> page that the last and next-to-last fields in Orchids events play a special role:<\/p>\n<ul>\n<li>The <strong>next-to-last<\/strong> field is used as an index.<\/li>\n<li>The <strong>last<\/strong> field is the contents to be parsed.<\/li>\n<\/ul>\n<p>In our example above, the next-to-last field produced by the <a title=\"The textfile module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=87\"><code>textfile<\/code><\/a> module is the\u00a0<code>.textfile.file<\/code> field.\u00a0 When its value (here, <code>\"\/var\/log\/messages\"<\/code>) matches the given tag, the corresponding dissection module is called, and will parse the last field; here, the <code>.textfile.line<\/code> field.<\/p>\n<p>Here are a few noteworthy features of dissection modules:<\/p>\n<ul>\n<li>Dissection modules <strong>concatenate<\/strong> their own list of fields to the input event.\u00a0 Therefore, the fields of the original input event (such as <code>.textfile.file<\/code> above) remain available, if needed.<\/li>\n<li>Dissection modules can dissect Orchids events output by <em>any<\/em> module, not just input modules, provided that it makes sense.<br \/>\nFor example, Orchids events produced by the <a title=\"The syslog module\" 
href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> module can themselves be (re)dissected, say by the <code>generic<\/code> module. Note that care was taken that Orchids events produced by the <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> module, as in the above example, have both a next-to-last field, serving as a dissection tag (here, of value <code>\"sendmail\"<\/code>), and a last field, which would be parsed further (the <code>.syslog.msg<\/code> field).<br \/>\nThis allows one to <strong>cascade<\/strong> dissection modules.\u00a0 This is described in more detail in the next section of this page.<\/li>\n<li>As we have said earlier, the <em>same<\/em> dissection module can be used to dissect data, with the same format, coming from different sources.\u00a0 Therefore, one can for example write the following in the <a title=\"orchids-inputs.conf\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=145\"><code>orchids-inputs.conf<\/code><\/a> configuration file:\n<pre># Syslog events\r\nINPUT\t\ttextfile\t\"\/var\/log\/messages\"\r\nDISSECT syslog\ttextfile\t\"\/var\/log\/messages\"\r\nINPUT\t\ttextfile\t\"\/var\/log\/auth.log\"\r\nDISSECT syslog\ttextfile\t\"\/var\/log\/auth.log\"\r\n## (standard syslog udp)\r\nINPUT\t\t\t        udp\t514\r\nDISSECT\t\tbintotext\tudp\t514\r\nDISSECT\t\tsyslog\tbintotext\t514\r\n<\/pre>\n<p>This example also displays cascading (last three lines), for <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> data transmitted over a <a title=\"The udp module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=171\"><code>udp<\/code><\/a> connection.<\/li>\n<\/ul>\n<h2>Advanced cascading<\/h2>\n<p>We have already seen cascading.\u00a0 For example:<\/p>\n<pre>INPUT\t\t\t    
    udp\t514\r\nDISSECT\t\tbintotext\tudp\t514\r\nDISSECT\t\tsyslog\tbintotext\t514\r\n<\/pre>\n<p>tells Orchids to receive some data with the\u00a0<a title=\"The udp module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=171\"><code>udp<\/code><\/a> module with tag 514 (serving as port number).\u00a0 Whatever is received this way is fed to the <a href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=121\"><code>bintotext<\/code><\/a> module, which cuts up raw packets into text lines, then to the <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> module for parsing.<\/p>\n<p>We can continue this way: the last field of an event parsed by\u00a0the <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> module is <code>.syslog.msg<\/code>, which can be parsed further, for example by the <code>generic<\/code> module or the <code>json<\/code> module.\u00a0 However, it would be a mistake to write the following:<\/p>\n<pre>INPUT\t\t\t        udp\t514\r\nDISSECT\t\tbintotext\tudp\t514\r\nDISSECT\t\tsyslog\tbintotext\t514\r\nDISSECT         generic syslog 514<\/pre>\n<p>The reason is that 514 will no longer be the correct dissection tag. Recall that <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a> will add its own fields to the event. 
Hence the new tag, provided by <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a>, is the contents of its next-to-last field, <code>.syslog.prog<\/code>.<\/p>\n<p>This is done on purpose.\u00a0 It allows you to connect a new dissector based on the name of the program reporting an event through <code>syslog<\/code>.\u00a0 For example, we may write:<\/p>\n<pre>INPUT\t\t\t        udp\t514\r\nDISSECT\t\tbintotext\tudp\t514\r\nDISSECT\t\tsyslog\tbintotext\t514\r\nDISSECT         my_sendmail_dissection_module syslog \"sendmail\"<\/pre>\n<p>for some hypothetical <code>my_sendmail_dissection_module<\/code> module, meant to parse <code>sendmail<\/code> messages further.<\/p>\n<p>The <code>generic<\/code> module does it differently, though: you just need to write<\/p>\n<pre>INPUT\t\t\t        udp\t514\r\nDISSECT\t\tbintotext\tudp\t514\r\nDISSECT\t\tsyslog\tbintotext\t514<\/pre>\n<p>and the <code>generic<\/code> module will plug itself onto all events produced by <a title=\"The syslog module\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=157\"><code>syslog<\/code><\/a>, automatically.\u00a0 What it does is described in the <a href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=628\"><code>generic<\/code><\/a> module&#8217;s configuration file.<\/p>\n<p>(That is not set in stone. 
In principle, all plumbing should be visible in the <code>orchids-inputs.conf<\/code> file, and the way the <a href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=628\"><code>generic<\/code><\/a> module does it contradicts this principle.\u00a0 Hence you may expect the ways of\u00a0the <a href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=628\"><code>generic<\/code><\/a> module to change.)<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The purpose of dissection modules is to parse data into Orchids event fields. Henceforth, dissecting will mean the same thing as parsing. Let us take an example.\u00a0 Imagine you want to give the contents of file \/var\/log\/messages as (polled) input to Orchids. This is a file in syslog format, and you would like Orchids to &hellip; <a href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/?page_id=131\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Dissection modules<\/span> <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"open","template":"","meta":{"footnotes":""},"class_list":["post-131","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=\/wp\/v2\/pages\/131","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=131"}],"version-history":[{"count":17,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=\/wp\/v2\/pages\/131\/revisions"}],"predecessor-version":[{"id":649,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=\/wp\/v2\/pages\/131\/revisions\/649"}],"wp:attachment":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdoc\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=131"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}