String pattern location: instr

Syntax

instr ( op, pattern, start, occurrence )

Input parameters

op

the operand

pattern

the string-pattern to be searched

start

the position in the input string of the character from which the search starts

occurrence

which occurrence of the pattern to search for (1 for the first, 2 for the second, and so on)

Examples of valid syntaxes

instr ( DS_1, "ab", 2 , 3 )
instr ( DS_1, "ab", 2 )
instr ( DS_1, "ab", _ , 2 )
instr ( DS_1, "ab" )

Semantics for scalar operations

The operator returns the position in the input string of a specified string (pattern). The search starts from the start-th character of the input string and finds the nth occurrence of the pattern, where n is the value of occurrence, returning the position of its first character. If start is omitted, the search starts from the 1st position. If occurrence is omitted, its value is 1. If the nth occurrence of the pattern, searching from the start-th character, is not found in the input string, the returned value is 0. For example:

instr ("abcde", "c" ) gives 3
instr ("abcdecfrxcwsd", "c", _ , 3 ) gives 10
instr ("abcdecfrxcwsd", "c", 5 , 3 ) gives 0
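The scalar semantics above can be sketched in Python. This is only an illustration, not a reference implementation: the function name mirrors the operator, the omitted-parameter placeholder `_` is modelled with default arguments, and counting overlapping matches as separate occurrences is an assumption, since the text does not specify it.

```python
def instr(op: str, pattern: str, start: int = 1, occurrence: int = 1) -> int:
    """Return the 1-based position of the nth occurrence of pattern, or 0."""
    if start < 1 or occurrence < 1:
        raise ValueError("start and occurrence must be >= 1")
    pos = start - 1  # convert the 1-based start to Python's 0-based index
    for _ in range(occurrence):
        pos = op.find(pattern, pos)
        if pos == -1:
            return 0  # fewer than `occurrence` matches from `start` onwards
        pos += 1  # assumption: the next search resumes one character later
    return pos  # the final increment already made this 1-based


print(instr("abcde", "c"))                        # 3
print(instr("abcdecfrxcwsd", "c", occurrence=3))  # 10
print(instr("abcdecfrxcwsd", "c", 5, 3))          # 0
```

The three calls reproduce the examples above; the keyword argument in the second call plays the role of `_` for the omitted start.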

Input parameters type

op

dataset { measure<string> _+ }
| component<string>
| string

pattern

component<string>
| string

start

component < integer [ value >= 1 ] >
| integer [ value >= 1 ]

occurrence

component < integer [ value >= 1 ] >
| integer [ value >= 1 ]

Result type

result

dataset { measure<integer[value >= 0]> int_var }
| component<integer[value >= 0]>
| integer[value >= 0]

Additional Constraints

The second parameter (pattern) cannot be omitted. For operations at Data Set level, the input Data Set must have exactly one string type Measure.

Behaviour

For invocations at Data Set level, the operator has the behaviour of the “Operators applicable on one Scalar Value or Data Set or Data Set Component”. For invocations at Component or Scalar level, it has the behaviour of the “Operators applicable on more than two Scalar Values or Data Set Components” (see the section “Typical behaviours of the ML Operators”). If op is a Data Set, instr returns a Data Set with a single Measure int_var of type integer.
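As a rough illustration of the Data Set level behaviour, the sketch below applies a scalar instr to each row and replaces the single string Measure with int_var. The list-of-dicts representation, the helper name `instr_dataset`, and the explicit list of identifier names are assumptions made for the example; the sample rows mirror DS_1 from the Examples section.

```python
def instr(op, pattern, start=1, occurrence=1):
    """Scalar sketch: 1-based position of the nth match, or 0 if absent."""
    pos = start - 1
    for _ in range(occurrence):
        pos = op.find(pattern, pos)
        if pos == -1:
            return 0
        pos += 1
    return pos


def instr_dataset(rows, identifiers, pattern, start=1, occurrence=1):
    """Apply instr to the single measure of a data set held as a list of dicts."""
    result = []
    for row in rows:
        measures = [name for name in row if name not in identifiers]
        if len(measures) != 1:
            # Mirrors the constraint: exactly one string Measure is required
            raise ValueError("data set must have exactly one measure")
        out = {name: row[name] for name in identifiers}
        out["int_var"] = instr(row[measures[0]], pattern, start, occurrence)
        result.append(out)
    return result


ds_1 = [
    {"Id_1": 1, "Id_2": "A", "Me_1": "hello world"},
    {"Id_1": 2, "Id_2": "A", "Me_1": "say hello"},
    {"Id_1": 3, "Id_2": "A", "Me_1": "he"},
    {"Id_1": 4, "Id_2": "A", "Me_1": "hi, hello!"},
]
ds_r = instr_dataset(ds_1, ("Id_1", "Id_2"), "hello")
print([row["int_var"] for row in ds_r])  # [1, 5, 0, 5]
```

Passing rows with two measures to `instr_dataset` raises `ValueError`, which corresponds to the single-Measure constraint stated under Additional Constraints.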

Examples

Given the operand datasets DS_1 and DS_2:

Input DS_1 (see structure)

| Id_1 | Id_2 | Me_1        |
|------|------|-------------|
| 1    | A    | hello world |
| 2    | A    | say hello   |
| 3    | A    | he          |
| 4    | A    | hi, hello!  |

Input DS_2 (see structure)

| Id_1 | Id_2 | Me_1  | Me_2  |
|------|------|-------|-------|
| 1    | A    | hello | world |
| 2    | B    | hi    |       |

Example 1

DS_r:= instr(DS_1,"hello");

results in (see structure):

DS_r

| Id_1 | Id_2 | int_var |
|------|------|---------|
| 1    | A    | 1       |
| 2    | A    | 5       |
| 3    | A    | 0       |
| 4    | A    | 5       |

Example 2

DS_r := DS_1[calc Me_2:=instr(Me_1,"hello")];

results in (see structure):

DS_r

| Id_1 | Id_2 | Me_1        | Me_2 |
|------|------|-------------|------|
| 1    | A    | hello world | 1    |
| 2    | A    | say hello   | 5    |
| 3    | A    | he          | 0    |
| 4    | A    | hi, hello!  | 5    |

Example 3

DS_r := DS_2 [calc Me_10 := instr(Me_1, "o"), Me_20 := instr(Me_2, "o")];

results in (see structure):

DS_r

| Id_1 | Id_2 | Me_1  | Me_2  | Me_10 | Me_20 |
|------|------|-------|-------|-------|-------|
| 1    | A    | hello | world | 5     | 2     |
| 2    | B    | hi    |       | 0     |       |

Example 4

Applying the instr operator at Data Set level to a multi Measure Data Set:

DS_r := instr(DS_2, "o") would raise an error because DS_2 has more than one Measure.