{"id":267,"date":"2015-06-10T13:25:19","date_gmt":"2015-06-10T13:25:19","guid":{"rendered":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/?p=267"},"modified":"2015-09-12T20:15:10","modified_gmt":"2015-09-12T20:15:10","slug":"converting-input-into-events","status":"publish","type":"post","link":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/?p=267","title":{"rendered":"Converting input into events"},"content":{"rendered":"<p>Orchids takes input from various data sources, and breaks them down into\u00a0<em>events<\/em>. \u00a0The purpose of this post is to explain what it means, how it works, and how one should write event-generating modules.<!--more--><\/p>\n<h3>Building events<\/h3>\n<p>Consider the following example. The <code>mod_textfile<\/code> module reads data from text files (and also Unix sockets, and pipes), and produces one event per text line. An Orchids event is just a list of pairs, consisting of a field and its value. The type of Orchids events is:<\/p>\n<pre>typedef struct event_s event_t;\r\nstruct event_s\r\n{\r\n  gc_header_t gc;\r\n  int32_t    field_id;\r\n  ovm_var_t *value;\r\n  event_t   *next;\r\n};<\/pre>\n<p>Formally, an object of type <code>event_t<\/code> is a pair consisting of a field (equated with its field id <code>field_id<\/code>) and its value <code>value<\/code>, plus a pointer <code>next<\/code> to subsequent pairs. 
The <code>NULL<\/code> pointer serves as the end of the list.<\/p>\n<p>Imagine the\u00a0<code>mod_textfile<\/code>\u00a0module reads the following line of text, as its first line, from the file<br \/>\n<code>\/Users\/goubault\/Desktop\/Code\/ORCHIDS\/orchids_new_version\/tests\/macosx\/authd.log<\/code>:<\/p>\n<pre>May\u00a0 9 10:37:07 MacBook-Pro-de-Jean.local com.apple.authd[36]: Succeeded authorizing right 'system.login.console' by client '\/System\/Library\/CoreServices\/loginwindow.app' [55916] for authorization created by '\/System\/Library\/CoreServices\/loginwindow.app' [55916] (3,0)\\n<\/pre>\n<p>(Although this is a minor detail, note the final newline <code>\\n<\/code>.) The <code>mod_textfile<\/code> module will then package that as an event which, in Orchids syntax, would be written:<\/p>\n<pre>.{.textfile.file = \"\/Users\/goubault\/Desktop\/Code\/ORCHIDS\/orchids_new_version\/tests\/macosx\/authd.log\",\r\n  .textfile.line_num = 1,\r\n  .textfile.line = \"May\u00a0 9 10:37:07 MacBook-Pro-de-Jean.local com.apple.authd[36]: Succeeded authorizing right 'system.login.console' by client '\/System\/Library\/CoreServices\/loginwindow.app' [55916] for authorization created by '\/System\/Library\/CoreServices\/loginwindow.app' [55916] (3,0)\\n\"\r\n }<\/pre>\n<p>This is an event with three fields. Inside Orchids, this is an object of type <code>event_t<\/code>. Let <em>base<\/em> be the first field id assigned to the <code>mod_textfile<\/code> module by the Orchids module configuration mechanism. (While I&#8217;m writing this, I am running a copy of Orchids, and in my case, <em>base<\/em> equals 4, but that may, and will, vary.) 
In writing <code>mod_textfile<\/code>, we have decided to number our fields as follows:<\/p>\n<pre>#define TF_FIELDS  3   \/* number of fields known to mod_textfile *\/\r\n#define F_LINE_NUM 0        \/* the .textfile.line_num field *\/\r\n#define F_FILE     1        \/* the .textfile.file field *\/\r\n#define F_LINE     2        \/* the .textfile.line field *\/\r\n<\/pre>\n<p>I&#8217;ll tell you later how the connection with field names is established. \u00a0For now, my point is that\u00a0the\u00a0<code>mod_textfile<\/code>\u00a0module will create an\u00a0<code>event_t<\/code>\u00a0object:<\/p>\n<ul>\n<li>with\u00a0<code>field_id<\/code>\u00a0equal to\u00a0<em>base<\/em>+<code>F_LINE_NUM<\/code>,\u00a0<code>value<\/code>\u00a0equal to 1 (packaged as an\u00a0<code>ovm_int_t<\/code>), and\u00a0<code>next<\/code>\u00a0equal to a pointer \u00a0to&#8230;<\/li>\n<li>another\u00a0<code>event_t<\/code>\u00a0object, with\u00a0<code>field_id<\/code>\u00a0equal to\u00a0<em>base<\/em>+<code>F_FILE<\/code>,\u00a0<code>value<\/code> equal to<br \/>\n<code>\"\/Users\/goubault\/Desktop\/Code\/ORCHIDS\/orchids_new_version\/tests\/macosx\/authd.log\"<\/code>\u00a0(packaged as an\u00a0<code>ovm_str_t<\/code>), and\u00a0<code>next<\/code>\u00a0equal to a pointer \u00a0to&#8230;<\/li>\n<li>another\u00a0<code>event_t<\/code>\u00a0object, with\u00a0<code>field_id<\/code>\u00a0equal to\u00a0<em>base<\/em>+<code>F_LINE<\/code>,\u00a0<code>value<\/code> equal to<br \/>\n<code>\"May\u00a0 9 10:37:07 MacBook-Pro-de-Jean.local com.apple.authd[36]: Succeeded authorizing right 'system.login.console' by client '\/System\/Library\/CoreServices\/loginwindow.app' [55916] for authorization created by '\/System\/Library\/CoreServices\/loginwindow.app' [55916] (3,0)\\n\"<\/code>\u00a0(packaged as an\u00a0<code>ovm_str_t<\/code>), and\u00a0<code>next<\/code>\u00a0equal to <code>NULL<\/code>.<\/li>\n<\/ul>\n<p>The preferred way of doing so is as follows. 
\u00a0First, we allocate space for the maximum number of fields, plus one extra slot that will hold the event itself once it is built.<\/p>\n<pre> GC_START(gc_ctx, TF_FIELDS+1);<\/pre>\n<p>Here <code>gc_ctx<\/code> is a pointer to our GC context. <code>GC_START()<\/code> is usually meant to allocate space on the stack that is known to the garbage collector, and we&#8217;ll also use it to store the values we are interested in.<\/p>\n<p>We now fill in this array with values. \u00a0For example, assuming the current line number is in <code>tf-&gt;line<\/code>, we write:<\/p>\n<pre>  val = ovm_int_new (gc_ctx, tf-&gt;line);\r\n  GC_UPDATE(gc_ctx, F_LINE_NUM, val);\r\n<\/pre>\n<p>This packages the line number as an Orchids object <code>val<\/code>, of type <code>ovm_int_t<\/code> (or rather, its supertype <a title=\"The ovm_var_t universal type\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/?p=23\"><code>ovm_var_t<\/code><\/a>), and then stores that object inside the array. \u00a0Note that <code>F_LINE_NUM<\/code> is used as an index into that array, and will also be used to compute the field id (by adding <em>base<\/em> to it).<\/p>\n<p>We do the same thing for all other fields, or only for some of them: if you don&#8217;t store anything at index\u00a0<code>F_FILE<\/code>, for example, the procedure described here will simply build an event in which the\u00a0<code>.textfile.file<\/code>\u00a0field\u00a0is absent, that is all.<\/p>\n<p>When all the fields we are interested in have been set this way, it only remains to call:<\/p>\n<pre> REGISTER_EVENTS(ctx, mod, TF_FIELDS, 0);\r\n GC_END(gc_ctx);<\/pre>\n<p>The <code>REGISTER_EVENTS()<\/code> macro takes the Orchids context <code>ctx<\/code>, the current module <code>mod<\/code>, the number of fields, and the dissection level (here, 0; there is the beginning of an explanation of the latter <a title=\"Reading variable-length event sources\" 
href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/?p=248\">here<\/a>). \u00a0We use\u00a0<code>GC_END()<\/code>\u00a0to free the array allocated by\u00a0<code>GC_START()<\/code>, see <a title=\"Writing functions that allocate memory\" href=\"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/?p=150\">this post<\/a>.<\/p>\n<h3>Registering events<\/h3>\n<p>The\u00a0<code>REGISTER_EVENTS()<\/code>\u00a0macro expands to calls to two functions in the Orchids API:\u00a0<code>add_fields_to_event()<\/code> and <code>post_event()<\/code>.<\/p>\n<p>The first of these functions:<\/p>\n<pre>void add_fields_to_event(orchids_t *ctx, mod_entry_t *mod,\r\n                         event_t **event, ovm_var_t **tbl_event, size_t sz);<\/pre>\n<p>takes an array <code>tbl_event<\/code> of <code>sz<\/code> Orchids values and adds them to the front of the event (that is, a list of pairs) <code>*event<\/code>. This is used by the <code>REGISTER_EVENTS(ctx, mod, nevents, dissection_level)<\/code> macro, which calls <code>add_fields_to_event (ctx, mod, (event_t **)&amp;GC_LOOKUP(nevents), (ovm_var_t **)GC_DATA(), nevents)<\/code>: the array is the one allocated by <code>GC_START()<\/code>, namely <code>GC_DATA()<\/code>, and contains <code>nevents<\/code> values, one per field; the event itself is stored in the remaining slot of the array (remember it contains <code>nevents<\/code>+1 slots; in our example, <code>nevents<\/code>=<code>TF_FIELDS<\/code>), namely <code>&amp;GC_LOOKUP(nevents)<\/code>.<\/p>\n<p>One can also use\u00a0<code>add_fields_to_event()<\/code>\u00a0directly, and some modules do so. 
\u00a0A refined function is:<\/p>\n<pre>void add_fields_to_event_stride(orchids_t *ctx, mod_entry_t *mod,\r\n                                event_t **event, ovm_var_t **tbl_event,\r\n                                size_t from, size_t to);<\/pre>\n<p>Instead of sweeping through the whole\u00a0<code>tbl_event<\/code>\u00a0table, it only looks at the entries numbered <code>from<\/code>, <code>from<\/code>+1, &#8230;, <code>to<\/code>-1. The purpose is efficiency. Some modules have a high number of field ids (604 for <code>mod_openbsm<\/code>), but most events will only use a much smaller number of fields. Imagine you create an event containing field ids 230, 231, and 232. Calling <code>add_fields_to_event()<\/code> will produce an event with three field-value pairs, but to do so, it will sweep through the 604 possible entries in the <code>tbl_event<\/code> table. Instead, call <code>add_fields_to_event_stride(ctx, mod, event, &amp;tbl_event[230], 230, 233)<\/code>: this will only look at three entries, which is much faster. (Note that the table argument should be <code>&amp;tbl_event[230]<\/code>, not <code>tbl_event<\/code>. This turns out to be more natural in the long run. \u00a0Also, <code>to<\/code> is equal to 233, not 232, since the range excludes <code>to<\/code>.)<\/p>\n<p>The second function called by <code>REGISTER_EVENTS()<\/code>:<\/p>\n<pre>void post_event(orchids_t *ctx, mod_entry_t *sender, event_t *event,\r\n                     int dissection_level);\r\n<\/pre>\n<p>posts the event we have just built to the Orchids engine. This function will do one of the following two things:<\/p>\n<ul>\n<li>If the current module (namely, the <code>sender<\/code> argument to <code>post_event()<\/code>) has a dissector, then it will call that dissector. 
\u00a0This will be the subject of another post; suffice it to say that the dissector will take the second field-value pair of the event we have just generated (here, the\u00a0<code>.textfile.line<\/code>\u00a0entry) and break it into further fields, which it will add to the current event (the\u00a0<code>event<\/code>\u00a0argument) before it calls\u00a0<code>post_event()<\/code>\u00a0again. \u00a0This is done just as above.<\/li>\n<li>If the current module does not have a dissector, it will inject the event\u00a0(the\u00a0<code>event<\/code>\u00a0argument) into the Orchids engine, by calling\u00a0<code>inject_event()<\/code>:\n<pre>void inject_event(orchids_t *ctx, event_t *event);<\/pre>\n<p>In turn, this will launch new Orchids threads, try to advance old ones, execute Orchids rules&#8230; the whole Orchids engine will start up for real. This will be explained in another post.<\/li>\n<\/ul>\n<h3>Field names and field ids<\/h3>\n<p>We haven&#8217;t yet explained how Orchids connects field names with field ids. \u00a0Still taking the example of the\u00a0<code>mod_textfile<\/code>\u00a0module, we define the following table:<\/p>\n<pre>static field_t tf_fields[] = {\r\n  { \"textfile.line_num\", &amp;t_uint, MONO_MONO,  \"line number\"                },\r\n  { \"textfile.file\",     &amp;t_str, MONO_UNKNOWN, \"source file name\"            },\r\n  { \"textfile.line\",     &amp;t_str, MONO_UNKNOWN,  \"current line\" }\r\n};\r\n<\/pre>\n<p>This declares all the <code>mod_textfile<\/code>\u00a0fields that we would like Orchids to know about. \u00a0 The numbering of fields is\u00a0<em>implicit<\/em>: for example,\u00a0<code>.textfile.line_num<\/code>\u00a0will necessarily be field number 0 in this module. \u00a0(Hence its field id will be\u00a0<em>base<\/em>+0, where\u00a0<em>base<\/em> is\u00a0the first field id assigned to the\u00a0<code>mod_textfile<\/code>\u00a0module by the Orchids module configuration mechanism.) 
\u00a0This is why we defined\u00a0<code>F_LINE_NUM<\/code>\u00a0as 0,\u00a0<code>F_FILE<\/code>\u00a0as 1, and\u00a0<code>F_LINE<\/code>\u00a0as 2. \u00a0The field numbers and the field table should always be modified together! \u00a0If you add a field, you must add it to the field table (here, <code>tf_fields[]<\/code>), and add its field number (to\u00a0<code>mod_textfile.h<\/code>) at the same time.<\/p>\n<p>Each field comes with additional typing information (e.g.,\u00a0<code>&amp;t_uint<\/code>\u00a0declares the field as an unsigned integer), monotonicity information (<code>MONO_MONO<\/code>\u00a0declares the field as\u00a0increasing through time,\u00a0<code>MONO_ANTI<\/code>\u00a0declares it as\u00a0decreasing\u00a0through time,\u00a0<code>MONO_CONST<\/code>\u00a0declares it as constant, and\u00a0<code>MONO_UNKNOWN<\/code>\u00a0is used in all other cases), and a short description string.<\/p>\n<p>We use the field table to inform Orchids of the connection by calling <code>register_fields()<\/code> in the module&#8217;s pre-configuration function <code>textfile_preconfig()<\/code>, e.g.:<\/p>\n<pre> register_fields(ctx, mod, tf_fields, TF_FIELDS);<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Orchids takes input from various data sources, and breaks them down into\u00a0events. 
\u00a0The purpose of this post is to explain what it means, how it works, and how one should write event-generating modules.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"class_list":["post-267","post","type-post","status-publish","format-standard","hentry","category-event-management"],"_links":{"self":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=\/wp\/v2\/posts\/267","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=267"}],"version-history":[{"count":9,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=\/wp\/v2\/posts\/267\/revisions"}],"predecessor-version":[{"id":290,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=\/wp\/v2\/posts\/267\/revisions\/290"}],"wp:attachment":[{"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=267"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=267"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/projects.lsv.ens-paris-saclay.fr\/orchidsdev\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=267"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}