Working with Talend Open Studio: parsing a CSV file

, 20 2017 . 17:35 +
railmisaka 17:35


This article is about Talend Open Studio (TOS).

TOS . , , . , .

, , .

TOS is an Open Source product.

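TOS generates Java code from a job, so the expressions typed into component settings are Java. Knowing Java is therefore helpful when working with TOS.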
TOS ( ).

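The basic unit of work in Talend is a job. A job is built from components, which have inputs and/or outputs. A link that carries rows from one component to another is called a flow.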

, , .

The input is a CSV file of the following form:

id,event_name,event_datetime,tag
1,"Hello, world!",2017-01-10T18:00:00Z,
2,"Event2",2017-01-10T19:00:00Z,tag1=q
3,Event3,2017-01-10T20:00:00Z,
4,"Hello, world!",2017-01-10T21:00:00Z,tag2=a
5,Event2,2017-01-10T22:00:00Z,
...

( event).

. .

First, describe the CSV file to TOS by creating metadata for it (Metadata -> File delimited).

, , , . File delimited . , Java , .. Java . . . , , .

File delimited


. CSV .

.



:

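The field separator is a comma (comma).
In Escape Char Settings, set Text Enclosure to the double quote, since values that contain a comma, such as "Hello, world!", are wrapped in quotes in the file.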
. .
.. . .

Click Refresh Preview to check that the file is parsed as expected.
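To see why the Text Enclosure setting matters for this file, here is a minimal Java sketch that splits one line on commas while respecting double quotes. It only illustrates the idea and is not the parser that TOS generates; the class and method names are made up, and escaped quotes inside a field are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLineSketch {
    // Splits one CSV line on commas, treating text between double quotes as one field.
    // Assumption: quotes are never escaped inside a field (true for the sample file).
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;            // the enclosure character toggles quoted mode
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());  // a separator outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());          // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("1,\"Hello, world!\",2017-01-10T18:00:00Z,"));
    }
}
```

Without the enclosure rule, the comma inside "Hello, world!" would be taken for a field separator and the columns of that row would shift.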

. . .



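The next step describes the columns of the CSV file. Date columns need a date pattern: a value such as 2017-01-10T22:00:00Z corresponds to the pattern yyyy-MM-dd'T'HH:mm:ss'Z'. The pattern follows Java conventions, since it ends up in Java code.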
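The date pattern can be tried out in plain Java before it goes into the schema. The sketch below assumes the pattern is interpreted the way java.text.SimpleDateFormat interprets it; the class name is made up for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class DatePatternSketch {
    static Date parse(String value) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        // The quoted 'Z' is matched as a literal letter, so no zone is read from the text.
        // UTC is set explicitly here; otherwise the JVM default zone would be used.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not in the expected format: " + value, e);
        }
    }

    public static void main(String[] args) {
        // Prints the parsed instant as milliseconds since the epoch.
        System.out.println(parse("2017-01-10T22:00:00Z").getTime());
    }
}
```

The point worth noticing is the quoted 'Z': it only checks that the letter is present, so the time zone has to come from somewhere else.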

CSV .

To read the file in a job, use the tFileInputDelimited component.

( ) (tFileInputDelimited ->) , : . .



. O . O (output) . .

In the tFileInputDelimited settings, set Property type to Repository and select the metadata created earlier.

CSV . , . , , .

. . . , . , . . . Build-In Property type. , .

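Replace the file name with context.INPUT_CSV. This is a context variable, so its value can be supplied when the built job is run from the command line (with --context_param INPUT_CSV=path).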
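As an illustration of what passing --context_param INPUT_CSV=path amounts to, here is a small Java sketch that collects such pairs from the command line into a map. This is not the code TOS generates; the class and method names are invented, and only the argument form shown above is assumed.

```java
import java.util.HashMap;
import java.util.Map;

class ContextParamSketch {
    // Collects "--context_param NAME=value" pairs from the command-line arguments.
    static Map<String, String> parse(String[] args) {
        Map<String, String> context = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("--context_param")) {
                String pair = args[++i];          // the argument right after the flag
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    // Split on the first '=' only, so the value may itself contain '='.
                    context.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return context;
    }

    public static void main(String[] args) {
        System.out.println(parse(args).get("INPUT_CSV"));
    }
}
```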

. .

tMap tFilterRow. tFilterRow .. . :



tMap tFilterRow . . tMap. Map Editor tMap, .

, () ().



( , , ). . . row1 . .

tFilterRow .

tFilterRow
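Add the conditions on the event_name column with the equality comparator (==): one for the value Hello, world! and, in the same way, one for Event2.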

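There is also a function column, F. A condition has the form F(input_column) {comparator} value. With F left empty, {comparator} set to == and value set to Hello, world!, the condition reads input_column == Hello, world!.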
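The condition form F(input_column) {comparator} value can be pictured as a simple per-row test. Below is a Java sketch of the case described above (F empty, comparator ==) applied to the first rows of the sample file. The Event class and the method names are made up for the example and do not mirror the code TOS generates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterRowSketch {
    // One row of the flow; only the columns needed for the example.
    static class Event {
        final int id;
        final String eventName;
        Event(int id, String eventName) { this.id = id; this.eventName = eventName; }
    }

    // The first five rows of the sample file.
    static List<Event> sample() {
        return Arrays.asList(
            new Event(1, "Hello, world!"), new Event(2, "Event2"), new Event(3, "Event3"),
            new Event(4, "Hello, world!"), new Event(5, "Event2"));
    }

    // Keeps the rows for which event_name == value holds (F left empty).
    static List<Event> filter(List<Event> rows, String value) {
        List<Event> kept = new ArrayList<>();
        for (Event row : rows) {
            if (value.equals(row.eventName)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        for (Event e : filter(sample(), "Hello, world!")) {
            System.out.println(e.id + " " + e.eventName);
        }
    }
}
```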

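To check the result, send the flow to tLogRow, which prints the rows to the console. The Mode setting of tLogRow controls how the output is laid out.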

Instead of tLogRow, tFileOutputDelimited can be used to write the rows to a CSV file.

. . :



Connection . Close . , Commit, . Connection.

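TOS groups connected components into a subjob. Commit should run only after the subjob that writes the data has finished, so it is linked with OnSubjobOk (under Trigger in the component's context menu). Errors can be handled in the same way with OnSubjobError.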

CSV .

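The tag column holds values such as tag2=a, that is, a name and a value separated by =. One way to split it is tJavaFlex, a component that lets you write your own Java code. The tag column is split into two new columns, tag_name and tag_value (both of type String).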



,
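// Default to empty strings so that rows without a tag still get valid values.
row4.tag_name = "";
row4.tag_value = "";

// row2.tag may be null when the CSV field is empty.
if (row2.tag != null && row2.tag.contains("="))
{
    // Limit 2 splits on the first '=' only and keeps a trailing empty value,
    // so "name=" yields an empty tag_value instead of failing on parts[1].
    String[] parts = row2.tag.split("=", 2);
    row4.tag_name = parts[0];
    row4.tag_value = parts[1];
}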

Here row2 is the incoming row and row4 is the outgoing one.
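The splitting rule is easy to try outside of TOS. The sketch below is a standalone version with invented names; it returns the pair as an array instead of writing to row4, and it treats a missing tag (as in the sample file) and a tag with nothing after = as cases that give empty strings rather than an exception.

```java
class TagSplitSketch {
    // Returns {tag_name, tag_value}; both are empty when the tag is missing or has no '='.
    static String[] split(String tag) {
        if (tag == null || !tag.contains("=")) {
            return new String[] {"", ""};
        }
        // Limit 2: split on the first '=' only and keep a trailing empty value.
        return tag.split("=", 2);
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"tag1=q", "", "tag2=a"}) {
            String[] parts = split(tag);
            System.out.println("[" + parts[0] + "] [" + parts[1] + "]");
        }
    }
}
```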



, . , tJavaFlex - , . . .

.

, . , Event : , , . . - , . .
.. :

  • -
  • id ,
  • , id ,

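In Postgres this can be done with INSERT ... WHERE NOT EXISTS, executed through the tPostgresqlRow component, which runs an arbitrary SQL query. The query for this case looks like this: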

String.format("
INSERT INTO tag(tag_name, tag_value)
	SELECT '%s', '%s'
	WHERE NOT EXISTS
	(SELECT * FROM tag
	WHERE tag_name = '%s'
	AND tag_value = '%s');",
input_row.tag_name, input_row.tag_value,
input_row.tag_name, input_row.tag_value)

, ( Java). , Java .
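One thing to watch with a query assembled through String.format is a value that contains a single quote, which would break the statement. The Java sketch below doubles single quotes before substitution and uses positional arguments (%1$s, %2$s) so that each value is passed only once. The class and method names are made up, and the escaping assumes the default Postgres string syntax; where bound parameters are available they are generally the safer choice.

```java
class TagSqlSketch {
    // Wraps a value in single quotes, doubling any single quote inside it.
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds the insert-if-absent statement for one (tag_name, tag_value) pair.
    static String insertIfAbsent(String tagName, String tagValue) {
        return String.format(
            "INSERT INTO tag(tag_name, tag_value) SELECT %1$s, %2$s "
                + "WHERE NOT EXISTS (SELECT * FROM tag WHERE tag_name = %1$s AND tag_value = %2$s);",
            literal(tagName), literal(tagValue));
    }

    public static void main(String[] args) {
        System.out.println(insertIfAbsent("tag1", "q"));
        System.out.println(insertIfAbsent("owner", "O'Brien"));
    }
}
```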

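Next, the id of the tag is needed.
In Postgres, INSERT can hand it back with RETURNING id. But when the row already exists, nothing is inserted and nothing is returned. So the query becomes something like this: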

String.format("
WITH T1 AS (
	SELECT * 
	FROM tag
	WHERE tag_name = '%s'
	AND tag_value = '%s'
), T2 AS (
	INSERT INTO tag(tag_name, tag_value)
	SELECT '%s', '%s'
	WHERE NOT EXISTS (SELECT * FROM T1)
	RETURNING tag_id
)
SELECT tag_id FROM T1
UNION ALL
SELECT tag_id FROM T2;",
input_row.tag_name, input_row.tag_value,
input_row.tag_name, input_row.tag_value)
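The logic of this query is a get-or-create lookup: T1 finds an existing row, T2 inserts one only when T1 is empty, and the final SELECT returns whichever id exists. The in-memory Java sketch below mirrors that logic only to make it easier to follow; it does not talk to a database, and the class and its id counter are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class TagIdSketch {
    private final Map<String, Integer> ids = new HashMap<>();
    private int nextId = 1;

    // Returns the id of the (name, value) pair, creating it on first use.
    int getOrCreate(String tagName, String tagValue) {
        String key = tagName + "\u0000" + tagValue;   // composite key for the pair
        Integer existing = ids.get(key);              // the role of T1: look the row up
        if (existing != null) {
            return existing;
        }
        int created = nextId++;                       // the role of T2: insert and return the id
        ids.put(key, created);
        return created;
    }

    public static void main(String[] args) {
        TagIdSketch tags = new TagIdSketch();
        System.out.println(tags.getOrCreate("tag1", "q"));
        System.out.println(tags.getOrCreate("tag1", "q"));
        System.out.println(tags.getOrCreate("tag2", "a"));
    }
}
```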

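In tPostgresqlRow, enable Propagate QUERY's recordset (on the Advanced settings tab) and pick a column of type Object to hold the result. The recordset is then unpacked with tParseRecordSet: in Prev. Comp. Column list, choose that column.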
:



That is, the column dbtag_id of type int takes its value from tag_id. Writing to event can then be done with tPostgresqlRow or tPostgresqlOutput.

:




, . TOS , . , , , . , . . , . tHashInput tHashOutput.

tHashInput tHashOutput
By default these components are hidden. They can be enabled through File -> Edit project properties -> Designer -> Palette Settings.

. , , , ( ).

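tHashOutput keeps the rows it receives in memory. Several tHashOutput components can be linked to one another, which behaves like a union in SQL. Reading must start only after writing has finished, so the two subjobs are connected with OnSubjobOk.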

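To read the stored rows, use tHashInput. Point it at the corresponding tHashOutput and give it the same schema. tHashInput is then connected to tMap: one flow comes in as Main, the other as Lookup. With no join key set, the result is a Cross Join.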



, , .

, .
Original source: habrahabr.ru (comments, light).

https://habrahabr.ru/post/338352/
