{"id":3712,"date":"2023-05-24T10:36:43","date_gmt":"2023-05-24T08:36:43","guid":{"rendered":"https:\/\/risc.web-email.at\/fachbeitrag-datenqualitaet-in-der-praxis\/"},"modified":"2023-11-02T16:39:20","modified_gmt":"2023-11-02T15:39:20","slug":"technical-article-data-quality-in-practice","status":"publish","type":"publication","link":"https:\/\/risc.web-email.at\/en\/technicalarticles\/technical-article-data-quality-in-practice\/","title":{"rendered":"Data quality in practice"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">by DI Paul Heinzlreiter<\/h3>\n\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<p><em>One of the central goals of data engineering is the preparation of data sets according to the requirements of the user or the subsequent process steps. The use of data can range from model training in the field of machine learning to improved internal company reporting based on an integrated database.<\/em><\/p>\n\n\n\n<p><em>Ensuring sufficient data quality is central in all cases. 
While the various fundamental aspects of data quality and their importance for companies were examined in an <a href=\"http:\/\/ris.w4.at\/en\/technical-article-data-quality\" target=\"_blank\" rel=\"noreferrer noopener\">earlier article<\/a>, this article presents examples of data quality problems from practice and discusses possible solutions.<\/em><\/p>\n\n\n\n<p><br><br><\/p>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-media-text has-media-on-the-right is-stacked-on-mobile is-vertically-aligned-center\"><div class=\"wp-block-media-text__content\">\n<p><strong>Table of contents<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data formats<\/li>\n\n\n\n<li>Error cause in structured text data<\/li>\n\n\n\n<li>Categories of data errors in structured data<\/li>\n\n\n\n<li>Sample data set<\/li>\n\n\n\n<li>Methods of data troubleshooting<\/li>\n\n\n\n<li>Algorithmic data error recovery<\/li>\n\n\n\n<li>Data loss prevention<\/li>\n\n\n\n<li>Role of data quality in project planning<\/li>\n\n\n\n<li>Author<\/li>\n<\/ul>\n<\/div><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" width=\"1024\" height=\"681\" src=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock_000008388429XLarge-1024x681.jpg\" alt=\"Data\" class=\"wp-image-3706 size-full\" srcset=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock_000008388429XLarge-1024x681.jpg 1024w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock_000008388429XLarge-300x200.jpg 300w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock_000008388429XLarge-768x511.jpg 768w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock_000008388429XLarge-1536x1022.jpg 1536w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock_000008388429XLarge.jpg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull 
\">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Data formats<\/h3>\n\n\n\n<p>A data error always represents a deviation from a target value. This means that possible data errors are strongly dependent on the type and format of the available data. Essentially, a distinction must be made here between structured and unstructured data. Unstructured data &#8211; especially text data &#8211; usually do not follow a schema, which means that a data error can only be detected by machine in rare cases.<\/p>\n\n\n\n<p>In text files, a typical example is incorrect localisation of floating point values due to inconsistent use of the decimal separator:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td class=\"has-text-align-left\" data-align=\"left\"><code>1.23<\/code><br><code>6.4532<\/code><br><code>7,564<\/code><br><code>-0.2<\/code><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>In this example, the wrong decimal separator is used in the third line, in this case the comma, which is common in German-speaking countries. Determining which decimal separator is the correct one in each case is usually done by external additional information or by determining the majority within the given data.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Unstructured text data<\/h4>\n\n\n\n<p>Possible data errors in text files:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Undefined or deviating character sets:<br>A character set describes the mapping of characters (a, b, \u00e4, \u20ac, \u2026) to their binary representation in memory. 
If it is not correctly defined or is unknown, special characters such as German umlauts are displayed and processed incorrectly.<\/li>\n\n\n\n<li>Encoding of line breaks:<br>A line break is represented differently on the operating systems Microsoft Windows, classic Mac OS and GNU\/Linux:\n<ul class=\"wp-block-list\">\n<li>In Windows, two characters are used for this purpose: a Carriage Return (ASCII code 13) followed by a Line Feed (ASCII code 10).<\/li>\n\n\n\n<li>In classic Mac OS (up to version 9), only Carriage Return is used; modern macOS uses Line Feed.<\/li>\n\n\n\n<li>In GNU\/Linux, only Line Feed is used.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Different localisation of the data, such as German and English decimal separators<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Binary formats and structured text files<\/h4>\n\n\n\n<p>In contrast, structured data is based on a schema that contains the data format, the structure of the data, as well as the data types and value ranges of the data values contained. Data schemas can be explicit or implicit depending on the data format and describe, for example, the following properties of tabular data per column:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Data type<\/li>\n\n\n\n<li>Whether cells can contain null values<\/li>\n\n\n\n<li>Validity range for numerical values<\/li>\n\n\n\n<li>Format for string values (e.g. date and timestamp)<\/li>\n<\/ul>\n\n\n\n<p>In any case, a data schema makes it possible to validate the content of a data set or to check it for errors. Because structured data is easier to check by machine, this article focuses on structured data such as that found in an industrial environment. From the point of view of data validation, structured data can be divided into two rough classes. 
These differ in whether the format already provides the data schema:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Binary data formats with schema in the metadata provided<\/strong>: Examples are storage formats of commercial programs such as Microsoft Excel, as well as image files, standardised binary protocols such as OPC UA or Protobuf, but also open BigData formats such as <a href=\"https:\/\/parquet.apache.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Apache Parquet<\/a>. Another very typical class of storage solutions that fall into this category are relational databases such as Microsoft SQL Server, PostgreSQL or MySQL.<\/li>\n\n\n\n<li>Structured text files without schema information in the data format:\n<ul class=\"wp-block-list\">\n<li>Comma separated values (CSV)<\/li>\n\n\n\n<li>XML files<\/li>\n\n\n\n<li>JSON files<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<h3 class=\"wp-block-heading\">Error cause in structured text data<\/h3>\n\n\n\n<p>While XML or JSON files rarely contain syntactic errors, as they are usually generated programmatically, data errors occur more frequently in CSV files, as these are often maintained manually (e.g. in Microsoft Excel). Typical causes of data format inconsistencies in CSV files are that there is no explicit specification of the format and errors can occur when the format is transferred manually from previous lines. Typical examples are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Inconsistent use of inverted commas for string fields<\/li>\n\n\n\n<li>Different localisation (e.g. 
dot or comma as decimal separator)<\/li>\n\n\n\n<li>Empty columns and different numbers of columns per row<\/li>\n\n\n\n<li>Different string representation of timestamps, date and time fields<\/li>\n\n\n\n<li>Numerical values under inverted commas<\/li>\n\n\n\n<li>Fluctuating accuracy for numeric entries from integer to double<\/li>\n<\/ul>\n\n\n\n<p>Deviations in the data representation can occur not only due to human errors during manual data entry, but also due to process changes during automated data generation. Especially with CSV and JSON files, it is often difficult to determine the type of a data entry, especially if the source data is not consistently filled. The same error categories can occur here as with manual data transfer.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<figure class=\"wp-block-image size-large is-style-rounded\"><img decoding=\"async\" width=\"1024\" height=\"683\" sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930-1024x683.jpg\" alt=\"Programming\" class=\"wp-image-1292\" srcset=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930-1024x683.jpg 1024w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930-300x200.jpg 300w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930-768x512.jpg 768w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930-1536x1025.jpg 1536w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930.jpg 1920w\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Categories of data errors in structured data<\/h3>\n\n\n\n<p>Depending on the type of structured data, different categories of data errors can 
occur:<\/p>\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_736abd7a1a5a50d399ed4670913e9a1e\">\n    <h3 class=\" inline-block \">\n        Violation of the data syntax    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_736abd7a1a5a50d399ed4670913e9a1e\" class=\"collapse\" aria-labelledby=\"headingblock_736abd7a1a5a50d399ed4670913e9a1e\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <p>This error category takes on a special role compared to the following ones, as it can normally only occur in structured text files, since binary files are almost without exception generated algorithmically, and are thus normally syntactically correct.<\/p>\n<p>A syntax error occurs when the text file does not follow the specified syntax of the required file format. Examples of this are:<\/p>\n<ul>\n<li>missing closing tags in XML or HTML files<\/li>\n<li>wrong number of columns in a CSV file<\/li>\n<li>wrong format of a date or time stamp in a text file<\/li>\n<li>missing, excess or incorrect inverted commas<\/li>\n<li>Incorrect localisation such as comma instead of dot as decimal separator<\/li>\n<\/ul>\n    <\/div>\n  <\/div>\n<\/div>\n\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_9adfa755be91871f2dcfd59ae182c9ba\">\n    <h3 class=\" inline-block \">\n        Wrong data types    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_9adfa755be91871f2dcfd59ae182c9ba\" class=\"collapse\" aria-labelledby=\"headingblock_9adfa755be91871f2dcfd59ae182c9ba\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <div id=\"datenfehlern\" class=\"vc_row wpb_row vc_row-fluid wpb_padding\" data-sectionid=\"datenfehlern\">\n<div class=\"wpb_column vc_column_container 
vc_col-sm-12\">\n<div class=\"vc_column-inner \">\n<div class=\"wpb_wrapper\">\n<div class=\"vc_tta-container\" data-vc-action=\"collapse\">\n<div class=\"vc_general vc_tta vc_tta-tabs vc_tta-color-grey vc_tta-style-classic vc_tta-shape-rounded vc_tta-spacing-1 vc_tta-tabs-position-left vc_tta-controls-align-left \">\n<div class=\"vc_tta-panels-container\">\n<div class=\"vc_tta-panels\">\n<div id=\"1686660628303-3f49ef8c-bb74\" class=\"vc_tta-panel vc_active\" data-vc-content=\".vc_tta-panel-body\">\n<div class=\"vc_tta-panel-body\">\n<div class=\"wpb_text_column wpb_content_element \">\n<div class=\"wpb_wrapper\">\n<p>This error occurs if a field to be validated has an incorrect data type. Typical examples are:<\/p>\n<ul>\n<li>Text in a field where numeric values are expected.<\/li>\n<li>Specification of overly permissive data types in binary formats, for example a field defined as text in which a floating point value is semantically expected.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"1686660888599-c66619f3-5466\" class=\"vc_tta-panel\" data-vc-content=\".vc_tta-panel-body\"><\/div>\n<div id=\"1686660889642-62e0e0d9-532c\" class=\"vc_tta-panel\" data-vc-content=\".vc_tta-panel-body\"><\/div>\n<div id=\"1686660890531-1a22e92f-68fe\" class=\"vc_tta-panel\" data-vc-content=\".vc_tta-panel-body\"><\/div>\n<div id=\"1686660893647-ad28bd8e-aa2b\" class=\"vc_tta-panel\" data-vc-content=\".vc_tta-panel-body\"><\/div>\n<div id=\"1686660894376-d0f420b7-2048\" class=\"vc_tta-panel\" data-vc-content=\".vc_tta-panel-body\"><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"beispieldatensatz\" class=\"vc_row wpb_row vc_row-fluid wpb_padding\" data-sectionid=\"beispieldatensatz\">\n<div class=\"wpb_column vc_column_container vc_col-sm-12\">\n<div class=\"vc_column-inner \">\n<div class=\"wpb_wrapper\">\n<div class=\"wdc-heading style-1 text-left\"><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n    <\/div>\n  
<\/div>\n<\/div>\n\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_97fc767279d8355c952360fab3921c3b\">\n    <h3 class=\" inline-block \">\n        Missing data    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_97fc767279d8355c952360fab3921c3b\" class=\"collapse\" aria-labelledby=\"headingblock_97fc767279d8355c952360fab3921c3b\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <p>Data schemas often allow data fields to be marked as optional, so data fields can remain empty even in structured binary data. This becomes a data error when the application semantics built on the data require those fields, for example fields that are to be used as foreign keys for linking tables. When data is read from a structured text format, missing data occurs much more often than with binary data. A classic example of this is a missing column in a CSV file.<\/p>\n    <\/div>\n  <\/div>\n<\/div>\n\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_78f0e62bee68aab72a115b514840cce3\">\n    <h3 class=\" inline-block \">\n        Missing meta information    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_78f0e62bee68aab72a115b514840cce3\" class=\"collapse\" aria-labelledby=\"headingblock_78f0e62bee68aab72a115b514840cce3\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <p>A typical example of missing meta-information is the specification of a time without a time zone. Storing local time without identifying the time zone can even lead to data loss, depending on the type of storage, because it results in duplicated timestamps for different points in time when switching from summer to winter time. 
Another example of missing meta-information is a missing data type specification when the type cannot be clearly derived from the data element itself.<\/p>\n<p>An example of this is the following representation in a CSV file:<\/p>\n<table border=\"0\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td class=\"code\">\n<div class=\"container\" title=\"Hint: double-click to select code\">\n<div class=\"line number1 index0 alt2\"><code class=\"java plain\">... ;<\/code><code class=\"java value\">10.3352<\/code><code class=\"java plain\">; ...<\/code><\/div>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Such a field is usually interpreted and stored as a floating point value. However, there are different data types for single and double precision. Which data type should be chosen depends on the value ranges (e.g. minimum and maximum values) that occur in the data set as a whole. If all data is already available, this can be derived programmatically. If, however, the data is only delivered gradually, it is safer to choose the data type with the larger value range. The disadvantage is the doubled memory requirement for the data field.<\/p>\n    <\/div>\n  <\/div>\n<\/div>\n\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_3e5fadf9cacb131e5f9ef0b11d46ab2a\">\n    <h3 class=\" inline-block \">\n        Violation of the semantic scope    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_3e5fadf9cacb131e5f9ef0b11d46ab2a\" class=\"collapse\" aria-labelledby=\"headingblock_3e5fadf9cacb131e5f9ef0b11d46ab2a\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <p>These errors describe values that are outside their range of validity, although they have a valid value for their data type. 
An example of this are outdoor temperatures of over 100\u00b0 Celsius in Central Europe.<\/p>\n    <\/div>\n  <\/div>\n<\/div>\n\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_4f0befbdf8a22304592b7d2a84fe9a5d\">\n    <h3 class=\" inline-block \">\n        Wrong order    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_4f0befbdf8a22304592b7d2a84fe9a5d\" class=\"collapse\" aria-labelledby=\"headingblock_4f0befbdf8a22304592b7d2a84fe9a5d\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <p>This error category describes the storage of data in the wrong order. This can be, for example, a time series of sensor values that has not been stored sorted in ascending order by timestamp. As long as the timestamp is available for each value, such a data set can still be read in correctly, but often timestamps are not explicitly stored to save storage space if the sensor values were determined by regular sampling. 
In such a case, the start time and the time interval between two measuring points are sufficient for determining all timestamps \u2013 if the storage sequence is correct.<\/p>\n<p>Another example where the order of storage is critical for semantics is the sequential storage of measurement data points in a text file, which are arranged on a regular grid, and whose positioning on the grid results implicitly from start position and step size along the axes of the coordinate system.<\/p>\n    <\/div>\n  <\/div>\n<\/div>\n\n\n\n<div class=\"accordion\">\n  <div class=\"accordion-header p-1.5 md:px-3 md:py-2 flex items-center justify-between \" id=\"headingblock_9c97beddcdc43578156d20c2a0e6312d\">\n    <h3 class=\" inline-block \">\n        Format changes for continuously supplied data    <\/h3>\n    <span class=\"accordion-icon-toggle inline-block\"><\/span>\n  <\/div>\n  <div id=\"collapseblock_9c97beddcdc43578156d20c2a0e6312d\" class=\"collapse\" aria-labelledby=\"headingblock_9c97beddcdc43578156d20c2a0e6312d\">\n    <div class=\"accordion-body p-1.5 md:p-3 \">\n      <p>In practice, this is one of the biggest data quality problems. If data are delivered in a consistent format, the processing of the data can be adapted to this format and also absorb certain recurring fluctuations in data quality. However, when there is an abrupt change in the format of the delivered data, data processing usually needs to be adapted. Typically, this involves data fields that are dropped or added, changes in data types or in the data format. From a data quality perspective, there is basically no distinction between data that is delivered in a block and has a non-uniform format and data that is delivered over time as a data stream and changes format over time. 
The difference for the data recipient, however, is that if data errors already exist, one can adjust the data import to them right away, whereas changes often happen unexpectedly with continuous data delivery.<\/p>\n    <\/div>\n  <\/div>\n<\/div>\n\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Sample data set<\/h3>\n\n\n\n<p>Typical structured data from the industrial environment are time series of sensor data. The time series of measurement data from a heat engine can serve as an example here, which is shown in excerpts. These data were taken directly from the operation of the machine via sensors and stored in the CSV file by a programme running on a Raspberry Pi mini-computer, which can be seen as quite representative of industrial data in terms of data quality.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><code>timestamp;temperature_heater;temperature_boiler;pressure_boiler;rpm;power_dynamo;power_heating;valve_aperture;water_level<\/code><br><code>2019-07-20T11:38:03;26.093750;48.555557;193.544373;0.000000;-0.001262;0.0;0.0;190<\/code><br><code>2019-07-20T11:38:04;26.093750;48.555557;180.865280;0.000000;-0.001262;0.0;0.0;190<\/code><br><code>2019-07-20T11:38:05;26.093750;47.416672;193.544373;0.000000;-0.001262;0.0;0.0;190<\/code><br><code>...<\/code><br><code>2019-07-20T11:38:58;26.093750;48.555557;206.114639;0.000000;-0.001262;0.0;0.0;190<\/code><br><code>2019-07-20T11:38:59;26.093750;47.416672;206.114639;0.000000;-0.001262;0.0;0.0;190<\/code><br><code>2019-07-20T11:39:00;26.093750;48.555557;206.114639;12.000000;-0.001262;446.973846;0.0;190<\/code><br><code>2019-07-20T11:39:01;26.093750;49.694443;193.489960;12.000000;-0.001262;442.720520;0.0;190<\/code><br><code>2019-07-20T11:39:02;26.093750;50.833328;206.060226;0.000000;-0.001262;446.973846;0.0;190<\/code><br><code>2019-07-20T11:39
:03;26.093750;49.694443;193.435562;0.000000;-0.001262;446.973846;0.0;190<\/code><br><code>...<\/code><br><code>2019-07-20T11:46:09;35.774303;279.750000;1494.212524;0.000000;-0.006040;459.733795;0.0;190<\/code><br><code>2019-07-20T11:46:10;35.774303;276.333313;1494.212524;0.000000;-0.006702;459.733795;0.25;190<\/code><br><code>2019-07-20T11:46:11;35.774303;279.750000;1519.461914;0.000000;-0.006702;459.733795;0.25;<\/code><br><code>2019-07-20T11:46:12;35.774303;279.750000;1519.516235;0.000000;-0.006702;459.733795;0.25;<\/code><br><code>...<\/code><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>In this CSV file, some of the data errors shown above are evident:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The timestamp in the first column does not include a time zone<\/li>\n\n\n\n<li>Negative values are shown in the <code>power_dynamo<\/code> column<\/li>\n\n\n\n<li>In the last two lines shown, the value for <code>water_level<\/code> is missing<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<h3 class=\"wp-block-heading\">Methods of data troubleshooting<\/h3>\n\n\n\n<p>An obvious &#8211; and good &#8211; approach to resolving data quality deficiencies is to request an improved version of the data from the data provider. However, this approach is often not feasible in practice. For example, if faulty sensor data has been recorded in a production line because a sensor is defective, it is often not possible, or at least very cost-intensive, to repeat the data recording. 
While the real monetary value of the collected data is often not assessable for the project participants at the beginning, the directly incurred costs for a repetition of a measurement &#8211; e.g. due to a production interruption &#8211; can be quantified very quickly. Furthermore, the replacement of a defective sensor can also lead to a considerable delay of the planned data analyses due to the often necessary involvement of external companies. A typical scenario here is the collection of sufficient training data for machine learning models, which can often take months. Here, a delay of possibly several weeks due to the replacement of a sensor can jeopardise the entire project schedule without the real benefit of such an intervention being clear in advance.<\/p>\n\n\n\n<p>For these reasons, algorithmic handling of the data error is often the most favourable solution overall.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<figure class=\"wp-block-image size-large is-style-rounded\"><img decoding=\"async\" width=\"1024\" height=\"683\" sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-1170740969-1024x683.jpg\" alt=\"Prescriptive Analytics\" class=\"wp-image-1401\" srcset=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-1170740969-1024x683.jpg 1024w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-1170740969-300x200.jpg 300w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-1170740969-768x512.jpg 768w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-1170740969-1536x1025.jpg 1536w, https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-1170740969.jpg 1920w\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 
class=\"wp-block-heading\">Algorithmic data error recovery<\/h3>\n\n\n\n<p>Unfortunately, there is no generally applicable method that always brings source data into the desired target format. However, comprehensive meta information or a structured binary format for the source data greatly reduces the effort required for data validation: the desired type of each data field is already known, so a value cannot be stored with the wrong type. In binary data, the most common error is therefore missing data, which occurs when the underlying schema has changed or a data field has been specified as optional although the application logic needs it. Data errors in structured binary data can thus usually be traced back to errors in the data schema or to an unplanned change to it.<\/p>\n\n\n\n<p>If structured text data are used as data sources, additional classes of possible errors are added &#8211; as described above.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Explicit schema information in text data<\/h4>\n\n\n\n<p>In the case of structured text files, however, schema information can be included by convention, such as in the header line of the CSV file, which can include the names of the columns. As an extension of this common convention, the header line can also carry data type information to ensure that the correct target data type is used:<\/p>\n\n\n\n<figure class=\"wp-block-table aligncenter\"><table><tbody><tr><td><code>timestamp:java.sql.Timestamp;pressure_boiler:java.lang.Double;rpm:java.lang.Double;valve_aperture:java.lang.Double;water_level:java.lang.Integer<\/code><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The above example specifies the corresponding Java data types; which data types are actually available for storage naturally depends on the target data storage system. 
Normally, however, it is sufficient to uniquely define the data type for a database or programming language, because then the conversion for other target systems can be done automatically.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Automated data error recovery<\/h4>\n\n\n\n<p>How can one react to data errors in the context of an automated process? If data is to be stored as part of an Extract &#8211; Transform &#8211; Load (ETL) process using a specific schema and a data set does not meet the requirements of the data schema, the simplest method is to discard that data set. This may be appropriate for some use cases &#8211; such as large datasets for AI model training &#8211; but in general the goal is to transform a dataset so that it can be stored in the intended schema. The following methods can be used to do this automatically:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Schema evolution or optional type fields<\/strong>: Schema evolution describes the possibility of versioning a schema, whereby data stored with an earlier version of the data schema remains processable with the new schema. A schema evolution can include adding, removing and type conversion of data fields. A good tool for this is optional type fields, which make it possible, for example, to add new fields and still process existing old data correctly. 
Optional data fields are also a good way to store empty data fields correctly without having to discard the entire record.<\/li>\n\n\n\n<li><strong>Implicit type conversion<\/strong>: If a source data type can be converted to the target data type without loss of accuracy, the conversion can be performed automatically in the ETL process.<\/li>\n\n\n\n<li><strong>Data interpolation for missing values in time series<\/strong>: This is an obvious operation, but whether it is permissible depends heavily on the intended use of the data.<\/li>\n<\/ul>\n\n\n\n<p>If data errors cannot be corrected automatically, it is a good idea to keep the raw data and send a notification, for example, so that the error can be investigated and the ETL process completed &#8211; if necessary after manual correction. If it is not a one-off error, the ETL process is usually adjusted in the course of investigating and correcting the problem in order to eliminate the error in the future. This applies in particular to errors resulting from a change in the data source format.<\/p>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Data loss prevention<\/h3>\n\n\n\n<p>If the source data is delivered continuously via a streaming process, it is particularly important to save the raw data first before it goes into further processing. This can prevent the data from being lost if data processing fails at a later point in the ETL process. After an error has been corrected or the ETL process has been adjusted after a format change, the stored raw data can be reprocessed. Raw data usually consumes more storage space than the later structured and compressed storage, especially if it is delivered as text files. Therefore, it is often advisable to delete the successfully processed raw data after validation. 
Raw data can of course also be compressed using standard algorithms, which saves a significant amount of storage space, especially for text data.<\/p>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">Role of data quality in project planning<\/h3>\n\n\n\n<p>Before the start of a data science or data engineering project, it is often difficult for all those involved to estimate the quality of the data to be included. This is often because the data has been collected over a certain period of time but has not yet been used in operations, for example because it is not yet available in sufficient quantity.<\/p>\n\n\n\n<p>Furthermore, raising data quality to the level required for the project goals often accounts for a considerable share of the project effort, which is difficult to estimate without knowledge of the data or its quality. 
To address this problem, it is possible, for example, to include a pre-project phase to jointly clarify the initial situation, or to choose an agile approach that enables a step-by-step joint procedure with flexible definition of milestones.<\/p>\n\n\n\n<p><em>With its expertise in the field of data engineering built up over more than ten years, RISC Software GmbH represents a reliable consulting and implementation partner, regardless of the area of application.<\/em><\/p>\n<\/div>\n<\/div>\n\n<div class=\"wp-block-group-container alignfull \">\n<div class=\"wp-block-group alignfull is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<h3 class=\"wp-block-heading\">Author<\/h3>\n\n\n<div class=\"contact-person\">\n      <picture>\n      \n      \n      \n      \n      <img decoding=\"async\" data-aos=\"fade-zoom-in\"\n           data-aos-offset=\"0\" class=\"w-full\" width=\"212\" height=\"293\"\n           src=\"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/pheinzlr1-removebg-preview.png\"\n           alt=\"\">\n    <\/picture>\n    \n\n<h5 class=\"wp-block-heading\">DI Paul Heinzlreiter<\/h5>\n\n\n\n<p>Senior Data Engineer<\/p>\n\n  <\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>One of the central goals of data engineering is the preparation of data sets according to the requirements of the users or the subsequent process steps. The use of data can range from model training in the field of machine learning to improved internal company reporting based on an integrated database.<\/p>\n","protected":false},"featured_media":1293,"template":"","publication-category":[50],"class_list":["post-3712","publication","type-publication","status-publish","has-post-thumbnail","hentry","publication-category-data-science-and-a-i"],"acf":[],"portrait_thumb_url":"https:\/\/risc.web-email.at\/app\/uploads\/2023\/06\/iStock-494345930-360x214.jpg","_links":{"self":[{"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/publication\/3712","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/publication"}],"about":[{"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/types\/publication"}],"version-history":[{"count":19,"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/publication\/3712\/revisions"}],"predecessor-version":[{"id":5119,"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/publication\/3712\/revisions\/5119"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/media\/1293"}],"wp:attachment":[{"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\
/media?parent=3712"}],"wp:term":[{"taxonomy":"publication-category","embeddable":true,"href":"https:\/\/risc.web-email.at\/en\/wp-json\/wp\/v2\/publication-category?post=3712"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}