Remap with VRL

Modify your observability data as it passes through your topology using Vector Remap Language (VRL)

status: beta
egress: stream
state: stateless

Configuration

Example configurations

{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". = parse_json!(.message)\n.new_field = \"new value\"\n.status = to_int!(.status)\n.duration = parse_duration!(.duration, \"s\")\n.new_name = del(.old_name)"
    }
  }
}
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
. = parse_json!(.message)
.new_field = "new value"
.status = to_int!(.status)
.duration = parse_duration!(.duration, "s")
.new_name = del(.old_name)"""
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: |-
      . = parse_json!(.message)
      .new_field = "new value"
      .status = to_int!(.status)
      .duration = parse_duration!(.duration, "s")
      .new_name = del(.old_name)
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". = parse_json!(.message)\n.new_field = \"new value\"\n.status = to_int!(.status)\n.duration = parse_duration!(.duration, \"s\")\n.new_name = del(.old_name)",
      "drop_on_error": null,
      "drop_on_abort": true
    }
  }
}
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
. = parse_json!(.message)
.new_field = "new value"
.status = to_int!(.status)
.duration = parse_duration!(.duration, "s")
.new_name = del(.old_name)"""
drop_on_abort = true
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: |-
      . = parse_json!(.message)
      .new_field = "new value"
      .status = to_int!(.status)
      .duration = parse_duration!(.duration, "s")
      .new_name = del(.old_name)
    drop_on_error: null
    drop_on_abort: true

drop_on_abort

optional bool
Drop the event if the VRL program is manually aborted through the abort statement.
default: true
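
As a rough illustration of this option, the sketch below aborts on debug-level events so that they are dropped. The transform ID `drop_debug` and the `.level` and `.checked` fields are hypothetical and are not part of the reference above.

```toml
[transforms.drop_debug]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
drop_on_abort = true
source = """
# Assumption: incoming events carry a string field named `.level`.
if .level == "debug" {
  abort
}
.checked = true"""
```

With `drop_on_abort = false`, an aborted event would presumably be passed downstream unmodified instead of being dropped.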

drop_on_error

optional bool
Drop the event if the VRL program returns an error at runtime.
default: false
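
A minimal sketch of enabling this option, assuming that events whose `message` is not valid JSON should be discarded rather than forwarded. The transform ID `parse_or_drop` is hypothetical.

```toml
[transforms.parse_or_drop]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
drop_on_error = true
source = """
# parse_json! raises a runtime error when .message is not valid JSON;
# with drop_on_error = true, that event is dropped.
. = parse_json!(.message)"""
```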

inputs

required [string]

A list of upstream source or transform IDs. Wildcards (*) are supported but must be the last character in the ID.

See configuration for more info.

Array string literal
Examples
[
  "my-source-or-transform-id",
  "prefix-*"
]

source

required string
The Vector Remap Language (VRL) program to execute for each event.

Telemetry

Metrics


events_in_total

counter
The number of events accepted by this component, either from tagged origins such as file and uri, or cumulatively from other origins.
component_kind required
The Vector component kind.
component_name required
The Vector component name.
component_type required
The Vector component type.
container_name optional
The name of the container from which the event originates.
file optional
The file from which the event originates.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the event originates.
peer_path optional
The pathname from which the event originates.
pod_name optional
The name of the pod from which the event originates.
uri optional
The sanitized URI from which the event originates.

events_out_total

counter
The total number of events emitted by this component.
component_kind required
The Vector component kind.
component_name required
The Vector component name.
component_type required
The Vector component type.

processed_bytes_total

counter
The number of bytes processed by the component.
component_kind required
The Vector component kind.
component_name required
The Vector component name.
component_type required
The Vector component type.
container_name optional
The name of the container from which the bytes originate.
file optional
The file from which the bytes originate.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the bytes originate.
peer_path optional
The pathname from which the bytes originate.
pod_name optional
The name of the pod from which the bytes originate.
uri optional
The sanitized URI from which the bytes originate.

processed_events_total

counter
The total number of events processed by this component. This metric is deprecated in favor of the events_in_total and events_out_total metrics.
component_kind required
The Vector component kind.
component_name required
The Vector component name.
component_type required
The Vector component type.

processing_errors_total

counter
The total number of processing errors encountered by this component.
component_kind required
The Vector component kind.
component_name required
The Vector component name.
component_type required
The Vector component type.
error_type required
The type of the error.

Examples

Parse Syslog logs

Given this event...
{
  "log": {
    "message": "\u003c102\u003e1 2020-12-22T15:22:31.111Z vector-user.biz su 2666 ID389 - Something went wrong"
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". |= parse_syslog!(.message)"
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: . |= parse_syslog!(.message)
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". |= parse_syslog!(.message)"
    }
  }
}
...this Vector event is produced:
{
  "log": {
    "appname": "su",
    "facility": "ntp",
    "hostname": "vector-user.biz",
    "message": "Something went wrong",
    "msgid": "ID389",
    "procid": 2666,
    "severity": "info",
    "timestamp": "2020-12-22T15:22:31.111Z",
    "version": 1
  }
}

Parse key/value (logfmt) logs

Given this event...
{
  "log": {
    "message": "@timestamp=\"Sun Jan 10 16:47:39 EST 2021\" level=info msg=\"Stopping all fetchers\" tag#production=stopping_fetchers id=ConsumerFetcherManager-1382721708341 module=kafka.consumer.ConsumerFetcherManager"
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". = parse_key_value!(.message)"
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: . = parse_key_value!(.message)
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". = parse_key_value!(.message)"
    }
  }
}
...this Vector event is produced:
{
  "log": {
    "@timestamp": "Sun Jan 10 16:47:39 EST 2021",
    "id": "ConsumerFetcherManager-1382721708341",
    "level": "info",
    "module": "kafka.consumer.ConsumerFetcherManager",
    "msg": "Stopping all fetchers",
    "tag#production": "stopping_fetchers"
  }
}

Parse custom logs

Given this event...
{
  "log": {
    "message": "2021/01/20 06:39:15 +0000 [error] 17755#17755: *3569904 open() \"/usr/share/nginx/html/test.php\" failed (2: No such file or directory), client: xxx.xxx.xxx.xxx, server: localhost, request: \"GET /test.php HTTP/1.1\", host: \"yyy.yyy.yyy.yyy\""
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
. |= parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+ \\+\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')

# Coerce parsed fields
.timestamp = parse_timestamp(.timestamp, "%Y/%m/%d %H:%M:%S %z") ?? now()
.pid = to_int!(.pid)
.tid = to_int!(.tid)

# Extract structured data
message_parts = split(.message, ", ", limit: 2)
structured = parse_key_value(message_parts[1], key_value_delimiter: ":", field_delimiter: ",") ?? {}
.message = message_parts[0]
. = merge(., structured)"""
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: >-
      . |= parse_regex!(.message, r'^(?P<timestamp>\d+/\d+/\d+ \d+:\d+:\d+
      \+\d+) \[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+):(?:
      \*(?P<connid>\d+))? (?P<message>.*)$')


      # Coerce parsed fields

      .timestamp = parse_timestamp(.timestamp, "%Y/%m/%d %H:%M:%S %z") ?? now()

      .pid = to_int!(.pid)

      .tid = to_int!(.tid)


      # Extract structured data

      message_parts = split(.message, ", ", limit: 2)

      structured = parse_key_value(message_parts[1], key_value_delimiter: ":", field_delimiter: ",") ?? {}

      .message = message_parts[0]

      . = merge(., structured)
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". |= parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+ \\+\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')\n\n# Coerce parsed fields\n.timestamp = parse_timestamp(.timestamp, \"%Y/%m/%d %H:%M:%S %z\") ?? now()\n.pid = to_int!(.pid)\n.tid = to_int!(.tid)\n\n# Extract structured data\nmessage_parts = split(.message, \", \", limit: 2)\nstructured = parse_key_value(message_parts[1], key_value_delimiter: \":\", field_delimiter: \",\") ?? {}\n.message = message_parts[0]\n. = merge(., structured)"
    }
  }
}
...this Vector event is produced:
{
  "log": {
    "client": "xxx.xxx.xxx.xxx",
    "connid": "3569904",
    "host": "yyy.yyy.yyy.yyy",
    "message": "open() \"/usr/share/nginx/html/test.php\" failed (2: No such file or directory)",
    "pid": 17755,
    "request": "GET /test.php HTTP/1.1",
    "server": "localhost",
    "severity": "error",
    "tid": 17755,
    "timestamp": "2021-01-20T06:39:15Z"
  }
}

Multiple parsing strategies

Given this event...
{
  "log": {
    "message": "\u003c102\u003e1 2020-12-22T15:22:31.111Z vector-user.biz su 2666 ID389 - Something went wrong"
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
structured =
  parse_syslog(.message) ??
  parse_common_log(.message) ??
  parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')
. = merge(., structured)"""
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: >-
      structured =
        parse_syslog(.message) ??
        parse_common_log(.message) ??
        parse_regex!(.message, r'^(?P<timestamp>\d+/\d+/\d+ \d+:\d+:\d+) \[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+):(?: \*(?P<connid>\d+))? (?P<message>.*)$')
      . = merge(., structured)
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": "structured =\n  parse_syslog(.message) ??\n  parse_common_log(.message) ??\n  parse_regex!(.message, r'^(?P<timestamp>\\d+/\\d+/\\d+ \\d+:\\d+:\\d+) \\[(?P<severity>\\w+)\\] (?P<pid>\\d+)#(?P<tid>\\d+):(?: \\*(?P<connid>\\d+))? (?P<message>.*)$')\n. = merge(., structured)"
    }
  }
}
...this Vector event is produced:
{
  "log": {
    "appname": "su",
    "facility": "ntp",
    "hostname": "vector-user.biz",
    "message": "Something went wrong",
    "msgid": "ID389",
    "procid": 2666,
    "severity": "info",
    "timestamp": "2020-12-22T15:22:31.111Z",
    "version": 1
  }
}

Modify metric tags

Given this event...
{
  "metric": {
    "counter": {
      "value": 102
    },
    "kind": "incremental",
    "name": "user_login_total",
    "tags": {
      "email": "vic@vector.dev",
      "host": "my.host.com",
      "instance_id": "abcd1234"
    }
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
.tags.environment = get_env_var!("ENV") # add
.tags.hostname = del(.tags.host) # rename
del(.tags.email)"""
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: |-
      .tags.environment = get_env_var!("ENV") # add
      .tags.hostname = del(.tags.host) # rename
      del(.tags.email)
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ".tags.environment = get_env_var!(\"ENV\") # add\n.tags.hostname = del(.tags.host) # rename\ndel(.tags.email)"
    }
  }
}
...this Vector event is produced:
{
  "metric": {
    "counter": {
      "value": 102
    },
    "kind": "incremental",
    "name": "user_login_total",
    "tags": {
      "environment": "production",
      "hostname": "my.host.com",
      "instance_id": "abcd1234"
    }
  }
}

Emitting multiple logs from JSON

Given this event...
{
  "log": {
    "message": "[{\"message\": \"first_log\"}, {\"message\": \"second_log\"}]"
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". = parse_json!(.message) # sets `.` to an array of objects"
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: ". = parse_json!(.message) # sets `.` to an array of objects"
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". = parse_json!(.message) # sets `.` to an array of objects"
    }
  }
}
...these Vector events are produced:
[
  {
    "log": {
      "message": "first_log"
    }
  },
  {
    "log": {
      "message": "second_log"
    }
  }
]

Emitting multiple non-object logs from JSON

Given this event...
{
  "log": {
    "message": "[5, true, \"hello\"]"
  }
}
...and this configuration...
[transforms.my_transform_id]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = ". = parse_json!(.message) # sets `.` to an array"
---
transforms:
  my_transform_id:
    type: remap
    inputs:
      - my-source-or-transform-id
    source: ". = parse_json!(.message) # sets `.` to an array"
{
  "transforms": {
    "my_transform_id": {
      "type": "remap",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "source": ". = parse_json!(.message) # sets `.` to an array"
    }
  }
}
...these Vector events are produced:
[
  {
    "log": {
      "message": 5
    }
  },
  {
    "log": {
      "message": true
    }
  },
  {
    "log": {
      "message": "hello"
    }
  }
]

How it works

Emitting multiple log events

Multiple log events can be emitted from remap by assigning an array to the root path (.). One log event is emitted for each element of the array.

If any of the array elements isn’t an object, a log event is created that stores the element’s value under the message key. For example, 123 is emitted as:

{
  "message": 123
}

Lazy Event Mutation

When you make changes to an event through VRL’s path assignment syntax, the change isn’t immediately applied to the actual event. If the program fails to run to completion, any changes made until that point are dropped and the event is kept in its original state.

If you want to make sure your event is changed as expected, you have to rewrite your program to never fail at runtime (the compiler can help you with this).

Alternatively, if you want to ignore/drop events that caused the program to fail, you can set the drop_on_error configuration value to true.

Learn more about runtime errors in the Vector Remap Language reference.
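
One way to keep a program from failing at runtime is to handle errors inline. The sketch below is illustrative only: the transform ID `safe_status`, the `.seen` field, and the fallback value `0` are assumptions, not part of the reference above.

```toml
[transforms.safe_status]
type = "remap"
inputs = [ "my-source-or-transform-id" ]
source = """
.seen = true

# With `to_int!(.status)`, a non-numeric status would raise a runtime error
# and the `.seen` assignment above would be discarded along with it.
# Error coalescing (`??`) supplies a fallback, so the program always completes.
.status = to_int(.status) ?? 0"""
```

Because every fallible call is handled, the changes made earlier in the program are kept even when `.status` cannot be converted.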

Vector Remap Language

The Vector Remap Language (VRL) is a restrictive, fast, and safe language we designed specifically for mapping observability data. It avoids the need to chain together many fundamental Vector transforms to accomplish rudimentary reshaping of data.

The intent is to offer the same robustness as a full language runtime (e.g., Lua) without paying the performance or safety penalty.

Learn more about Vector’s Remap Language in the Vector Remap Language reference.

State

This component is stateless, meaning its behavior is consistent across each input.