Thursday 24 January 2019

linux - How to separately sort lines within multiple "chunks" separated with headers?


This question is complementary to this one: Sort packs of lines alphabetically. After answering there, it turned out I had totally misunderstood the question and solved another problem. Not wanting my solution to be forgotten, I'm posting the problem here (and my solution below).




Consider a text like:


[ProfileB]
param3=z
param2=y
param1=x
[ProfileA]
param1=k
param3=l
param2=

I need to sort parameters within every [Profile*] block separately. The above example should be sorted to this:


[ProfileB]
param1=x
param2=y
param3=z
[ProfileA]
param1=k
param2=
param3=l

How can I do it with standard Unix/Linux tools?



Answer



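This works on my Debian system. It relies on GNU sed (for \x00 in the replacement) and GNU split (for the -t and --filter options):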


sed '1 ! s/^\[/\x00\[/g' |
split -t '\0' -l 1 --filter='
tr -d "\0" |
{ IFS="" read -r REPLY; printf "%s\n" "$REPLY"; sort; }
'

To work with files, use redirections, e.g. { sed … ; } <input.txt >output.txt, where sed … stands for the whole command.
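As a rough sketch of the file-based usage, the pipeline can be wrapped in a shell function and fed through redirections. The function name sort_chunks and the file names input.txt and output.txt are placeholders chosen for this illustration, and GNU sed and GNU split are assumed. A named variable (header) is passed to read here so that the filter does not depend on the shell's default REPLY variable, which not every sh provides.

```shell
# Hypothetical wrapper around the pipeline; assumes GNU sed and GNU split.
sort_chunks() {
    sed '1 ! s/^\[/\x00\[/g' |
    split -t '\0' -l 1 --filter='
        tr -d "\0" |
        { IFS="" read -r header; printf "%s\n" "$header"; sort; }
    '
}

# Sample data from the question, written to a placeholder file.
printf '%s\n' '[ProfileB]' 'param3=z' 'param2=y' 'param1=x' \
              '[ProfileA]' 'param1=k' 'param3=l' 'param2=' > input.txt

# Read from one file, write the sorted chunks to another.
sort_chunks < input.txt > output.txt
cat output.txt
```

With the sample data, output.txt should contain the sorted text shown in the question, with the headers left in their original order.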


The procedure is as follows:



  1. sed inserts a null character before every [ that is at the beginning of a line, unless the line is the first one. This way null characters separate the profiles.

  2. split generates chunks, taking records separated by null characters, one record per chunk. Instead of writing to files, split calls a filter for each chunk separately:

    1. first, tr deletes the null characters;

    2. then read and printf just echo the first line (the header) of the chunk;

    3. finally, sort sorts the remaining lines.



  3. Chunks are processed sequentially; the output is a single concatenated stream.
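To see what step 1 does, the null characters can be made visible. The snippet below is only an illustration: it runs the same sed expression (GNU sed assumed) on a shortened sample and then uses tr to replace each null character with @ for display. The variable name marked is an arbitrary choice.

```shell
# Run step 1 only, then swap each NUL for "@" so the record boundaries show up.
marked=$(
    printf '%s\n' '[ProfileB]' 'param3=z' '[ProfileA]' 'param1=k' |
    sed '1 ! s/^\[/\x00\[/g' |
    tr '\0' '@'
)
printf '%s\n' "$marked"
```

The @ appears in front of [ProfileA] but not in front of [ProfileB], because the first line is excluded by the "1 !" address. That marker is the position where split cuts the stream into chunks.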

