--Comparing Text Strings--



Saajha

Posted on 08-14-09 2:13 PM

I have two text files to compare: file A and file B.

File A is nicely formatted with sections, headings, etc., for better readability.

File B is a linear raw list of strings (really a bunch of machine names).

I am trying to compare file A and file B and find the strings in each file that don't exist in the other; in other words, identify the strings unique to each file.

The UNIX utility *diff* works great, and so do Windows tools like 'ExamDiff' and 'CompareIt!', but they only cancel out a single occurrence of each matching string pair and ignore the rest.

For instance, I have

List A       List B
------       ------
abc          bcd
def          def
def          ijk
ghi          jkl

The result will be:

List A       List B
------       ------
abc          bcd
def          ijk
ghi          jkl

(Note that the eliminated strings were the ones that followed One-to-One matching)

While the expected result is:

List A       List B
------       ------
abc          bcd
ghi          ijk
             jkl

Here both occurrences of 'def' in list A are eliminated, using a one-to-many comparison.

Can anyone suggest a solution? A tool, or some script logic?
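A minimal sketch of the one-to-many elimination described above, assuming each string sits on its own line and has to match a whole line exactly. The file names a.txt and b.txt and the function name only_in are made up for illustration; the grep options used (-F, -x, -v, -f) are standard ones.

```shell
#!/bin/sh
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Sample data from the example above (hypothetical file names).
printf '%s\n' abc def def ghi > a.txt
printf '%s\n' bcd def ijk jkl > b.txt

# only_in FILE OTHER: print every line of FILE that appears nowhere in OTHER.
#   -F  treat patterns as fixed strings, not regular expressions
#   -x  a pattern must match the whole line
#   -v  keep the lines that do NOT match
#   -f  read the patterns (one per line) from OTHER
# Repeated lines in FILE are kept as long as they never occur in OTHER.
only_in() {
    grep -F -x -v -f "$2" "$1"
}

only_in a.txt b.txt   # lines only in list A
only_in b.txt a.txt   # lines only in list B
```

With the sample data this prints abc and ghi for list A, then bcd, ijk and jkl for list B. A formatted file like file A would also report its heading lines as unique, so those would need to be stripped or ignored first.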

~@~

techGuy

Posted on 08-14-09 4:37 PM

Try Beyond Compare: http://www.scootersoftware.com/download.php

Last edited: 14-Aug-09 04:42 PM

Saajha

Posted on 08-14-09 4:59 PM

Just installed and tried it. Still the same issue: it does the comparison, but only for a single occurrence. I haven't had a chance to look at the options yet, though.

Thanks!

~@~

parajn

Posted on 08-14-09 7:07 PM

Download the 30-day free trial of Araxis Merge. This tool works great.
http://www.araxis.com/merge/

Saajha

Posted on 08-18-09 12:19 PM

Have you tried this tool for a similar purpose?
I'll take a look. Thanks!

BTW, I was able to get it done in Excel (underrated, but it worked great) by playing around with logical comparison formulas.
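The post does not say which formulas were used. One common spreadsheet pattern is a COUNTIF-style helper column that counts how often each entry appears in the other list, followed by a filter on zero. A rough shell analogue of that idea, offered only as an assumption about the approach; the file names and the count_in function are hypothetical.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1

# Sample lists from the first post (hypothetical file names).
printf '%s\n' abc def def ghi > a.txt
printf '%s\n' bcd def ijk jkl > b.txt

# count_in FILE OTHER: print each line of FILE, a tab, and the number of
# times that exact line occurs in OTHER (0 means "unique to FILE").
count_in() {
    awk -v other="$2" '
        BEGIN {
            # Tally every line of OTHER before reading FILE.
            while ((getline line < other) > 0) count[line]++
            close(other)
        }
        { printf "%s\t%d\n", $0, count[$0] }
    ' "$1"
}

count_in a.txt b.txt
```

Rows with a count of 0 are the ones to keep, and every repeated row is kept or dropped together because each row is judged on its own.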

~@~

gidilat

Posted on 08-18-09 1:34 PM

Create two text files.
x.txt with
abc
def
def
ghi

and y.txt with
bcd
def
ijk
jkl

Issue the following commands (at a Cygwin prompt):
# sort x and copy its unique elements to x1
sort -u x.txt > x1.txt
# sort y and copy its unique elements to y1
sort -u y.txt > y1.txt

# copy lines that appear in both x1 and y1 to z
comm -1 -2 x1.txt y1.txt > z.txt

# output lines that appear in x1 only
comm -2 -3 x1.txt z.txt

# output lines that appear in y1 only
comm -2 -3 y1.txt z.txt

This works for this limited dataset.
Give it a try.
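The intermediate z.txt can probably be skipped, since comm applied directly to the two sorted lists already separates the lines found in only one of them. A sketch under that assumption, reusing the x.txt and y.txt contents above; the output names only_x.txt and only_y.txt are made up here.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1
# Use one fixed collation so sort and comm agree on line order.
export LC_ALL=C

printf '%s\n' abc def def ghi > x.txt
printf '%s\n' bcd def ijk jkl > y.txt

# comm needs sorted input; -u also collapses repeated lines.
sort -u x.txt > x1.txt
sort -u y.txt > y1.txt

# comm prints three columns: only in file 1, only in file 2, in both.
# -2 -3 suppresses columns 2 and 3, leaving lines unique to x1.
comm -2 -3 x1.txt y1.txt > only_x.txt
# -1 -3 suppresses columns 1 and 3, leaving lines unique to y1.
comm -1 -3 x1.txt y1.txt > only_y.txt

cat only_x.txt only_y.txt
```

Because of sort -u, each unique string is reported once no matter how often it appeared in the original file.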

Saajha

Posted on 08-18-09 4:37 PM

Thanks for the advice, @gidilat, and welcome to the forum if you are a new member :-)

I just looked through the commands and spotted a minor gotcha.

When you run sort -u, you sort the list and keep only its unique elements, which gets rid of the repeated occurrences of each element. Once that's done, it's really a one-to-one comparison, no?

In the above scenario, ALL occurrences of def in list A were eliminated, as they matched def in list B.
But if abc were listed twice in list A and did not exist at all in list B, sort -u would remove the second instance of abc from list A, correct? The goal, though, is to keep every single occurrence of abc in list A if it's not present in list B.

Thanks again for chiming in.
~@~

gidilat

Posted on 08-18-09 7:02 PM

You are changing the goal from
"in other words, identify unique strings in each file"
to
"goal is to have every single occurrence of abc in list A if it's not present in list B"

A problem needs to be defined properly before it can be solved.

Saajha

Posted on 08-19-09 11:08 AM

Hmm, so - I said:
"...the goal is to have every single occurrence of abc in list A if it's not present in list B"

Doesn't that leave each list with only the strings/elements that are unique relative to the other list (and hence "...unique strings in each file")?

When we compare two files and speak about 'uniqueness', I'd think the reference is to uniqueness with respect to each other, and not within oneself.

Regardless, I'll grant that 'unique strings in each file with respect to each other' or something similar would've made it a little more descriptive. Sorry, and thanks!

So, just for the heck of it, I tried comm without sorting and isolating the unique elements. It works just as well as the other alternatives (except Excel) that I tried in the past: it does the comparison, but only gets rid of a single occurrence of the match.
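The single-occurrence behaviour follows from how comm walks its two sorted inputs in step: a line in one file can pair with at most one line in the other. A small sketch that reproduces this with the sample lists from the first post; the file names are hypothetical.

```shell
#!/bin/sh
cd "$(mktemp -d)" || exit 1
# Use one fixed collation so sort and comm agree on line order.
export LC_ALL=C

printf '%s\n' abc def def ghi > a.txt
printf '%s\n' bcd def ijk jkl > b.txt

# Sort WITHOUT -u so repeated lines survive.
sort a.txt > a_sorted.txt
sort b.txt > b_sorted.txt

# Lines left over in list A after one-to-one pairing with list B.
# The first def pairs with the def in list B; the second def has no partner.
comm -2 -3 a_sorted.txt b_sorted.txt > a_left.txt
cat a_left.txt
```

This prints abc, def and ghi, matching the one-to-one result table in the first post rather than the expected one-to-many result.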

~@~
