Converting complex SQL join to Pandas merge


I have the following SQL query for finding overlaps between begin and end for a particular note_id:

select a.*, b.*
from test.analytical_cui_mipacq_concepts_new a
inner join test.analytical_cui_mipacq_concepts_new b on ( 
 ( b.begin>=a.begin and b.begin<=a.end )
 or
 ( b.begin<=a.begin and b.end>=a.begin )
)
where ((a.system='metamap' and b.system!=a.system) or (a.system='metamap' and b.system=a.system and a.id_ != b.id_ and a.note_id = b.note_id))

that is taking forever and a day to run. I am trying to follow this thread to convert to a pandas merge:
pandas-join-dataframe-with-condition

and so far I have come up with the following (new is my original dataframe, note_id is how I identify a particular individual, and id_ is the primary key from the db table):

a = new.copy()
b = new.copy()
b.columns

b = b.rename(index=str, columns={'end':'end_x', 'begin': 'begin_x', 'cui': 'cui_x', 
 'old_cui': 'old_cui_x', 'type': 'type_x', 
 'polarity': 'polarity_x', 'id_':'id_x'})

c = a.merge(b, how='inner', on=['note_id'])

print(len(a), len(b), len(c))
c.loc[(((c.begin >= c.begin_x) & (c.begin <= c.end_x)) 
 | ((c.begin<=b.begin_x) & (c.end>=c.begin_x))) &
 (((c.system=='metamap') & (c.system!=c.system_x)) 
 | ((c.system_x=='metamap') & (c.system==c.system_x) 
 & (c.id_ != c.id_x) & (c.note_id == c.note_id_x)))]

When I run this, I get the following error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-2-e8c0d060f2a0> in <module>()
 32 print(len(a), len(b), len(c))
 33 c.loc[(((c.begin >= c.begin_x) & (c.begin <= c.end_x)) 
---> 34 | ((c.begin<=b.begin_x) & (c.end>=c.begin_x))) &
 35 (((c.system=='metamap') & (c.system!=c.system_x)) 
 36 | ((c.system_x=='metamap') & (c.system==c.system_x) 

/anaconda3/lib/python3.7/site-packages/pandas/core/ops.py in wrapper(self, other, axis)
 1674 
 1675 elif isinstance(other, ABCSeries) and not self._indexed_same(other):
-> 1676 raise ValueError("Can only compare identically-labeled "
 1677 "Series objects")
 1678 

ValueError: Can only compare identically-labeled Series objects

Not exactly sure what this means, even after Googling around for it.

The data look like:

begin,polarity,end,note_id,type,system,cui,id_
31,1,37,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0004352,1
63,1,71,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0574032,2
81,1,86,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0039869,3
96,1,100,527982345,biomedicus.v2.UmlsConcept,biomedicus,C1123023,4
96,1,105,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0015230,5
101,1,105,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0015230,6
130,1,138,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0574032,7
143,1,144,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0184661,8
156,1,162,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0026591,9
176,1,185,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0004268,10
201,1,209,527982345,biomedicus.v2.UmlsConcept,biomedicus,C0574032,11
101,-1,116,527982345,org.metamap.uima.ts.Candidate,metamap,C0445223,168094
100,-1,116,527982345,org.metamap.uima.ts.Candidate,metamap,C0445223,168095
109,-1,116,527982345,org.metamap.uima.ts.Candidate,metamap,C0445223,168096
124,1,129,527982345,org.metamap.uima.ts.Candidate,metamap,C0205435,168097
124,1,129,527982345,org.metamap.uima.ts.Candidate,metamap,C1279901,168098
130,1,138,527982345,org.metamap.uima.ts.Candidate,metamap,C0574032,168099
130,1,138,527982345,org.metamap.uima.ts.Candidate,metamap,C1827465,168100
143,1,144,527982345,org.metamap.uima.ts.Candidate,metamap,C0021966,168101
143,1,144,527982345,org.metamap.uima.ts.Candidate,metamap,C0221138,168102
31,1,37,527982345,org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention,ctakes,C0004352,55414
599,1,603,527982345,org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention,ctakes,C0206655,55415
67,1,73,4069123471-4,org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention,ctakes,C3263723,55416
646,-1,650,527982345,org.apache.ctakes.typesystem.type.textsem.DiseaseDisorderMention,ctakes,C0042109,55417
31,1,37,527982345,edu.uth.clamp.nlp.typesystem.ClampNameEntityUIMA,clamp,,32496
56,1,71,527982345,edu.uth.clamp.nlp.typesystem.ClampNameEntityUIMA,clamp,C0993666,32497
92,1,105,527982345,edu.uth.clamp.nlp.typesystem.ClampNameEntityUIMA,clamp,,32498
96,1,100,527982345,edu.uth.clamp.nlp.typesystem.ClampNameEntityUIMA,clamp,,32499
120,1,129,527982345,edu.uth.clamp.nlp.typesystem.ClampNameEntityUIMA,clamp,C2008415,32500

edited Mar 8 at 3:43

asked Mar 8 at 2:37

horcle_buzz


That means the Series a and b have different indexes, and pandas does not define Series comparison in this case. The same error occurs with the test a = pd.Series([1, 2], index=[0, 1]); b = pd.Series([1, 2], index=[0, 2]); a == b. Could you post a few lines of example data?

– Peter Leimbigler
Mar 8 at 2:48
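The test in the comment above can be written out as a small script. This is a minimal sketch: the names left, right, and frame are illustrative and not from the question, and the column values are made up for the demonstration.

```python
import pandas as pd

left = pd.Series([1, 2], index=[0, 1])
right = pd.Series([1, 2], index=[0, 2])  # same values, different index labels

# Comparing two Series whose indexes differ raises ValueError
try:
    left == right
    raised = False
except ValueError:
    raised = True

# Comparing the underlying arrays ignores the labels entirely
positional = left.to_numpy() == right.to_numpy()

# Two columns of the same DataFrame always share that frame's index,
# so comparing them never triggers the error
frame = pd.DataFrame({"begin": [31, 63], "begin_x": [31, 56]})
same_frame = frame["begin"] >= frame["begin_x"]
```

In the question's filter, one of the comparisons reads c.begin<=b.begin_x, which mixes a column of the merged frame c with a column of b. Those two Series do not share an index, which would be consistent with the traceback pointing at that line.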

Done. I'm basically trying to find overlaps in my begin and end columns across a single note_id instance.

– horcle_buzz
Mar 8 at 3:11

Can you post the data as actual text rather than as an image, so that we can paste it into our IDEs? Thanks!

– aws_apprentice
Mar 8 at 3:17

Done. Pasting from Excel makes it an image, for some stupid reason.

– horcle_buzz
Mar 8 at 3:25

You should probably sample your data again, given that what you provided does not match some of the conditions you specify, such as system == 'metamap'.

– aws_apprentice
Mar 8 at 3:27
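Putting the comments together, here is a hedged sketch of how the merge and filter might look once every comparison uses columns of the merged frame only. It uses a handful of rows from the sample data and a subset of the columns. Two assumptions to be aware of: "system" is renamed on the right side as well (the original rename skipped it), and merging on note_id restricts every pair to the same note, which the first branch of the SQL WHERE clause does not require.

```python
import pandas as pd

# A few rows taken from the sample data in the question (subset of columns)
new = pd.DataFrame(
    [
        (96, 100, "527982345", "biomedicus", 4),
        (96, 105, "527982345", "biomedicus", 5),
        (130, 138, "527982345", "biomedicus", 7),
        (101, 116, "527982345", "metamap", 168094),
        (130, 138, "527982345", "metamap", 168099),
        (130, 138, "527982345", "metamap", 168100),
        (67, 73, "4069123471-4", "ctakes", 55416),
    ],
    columns=["begin", "end", "note_id", "system", "id_"],
)

a = new
# Rename every non-key column on the right side, including "system"
b = new.rename(columns={"begin": "begin_x", "end": "end_x",
                        "system": "system_x", "id_": "id_x"})

# Self-join within each note; the merged frame c gets its own fresh index
c = a.merge(b, how="inner", on="note_id")

# Overlap test from the SQL ON clause (a = unsuffixed columns, b = "_x" columns)
overlap = (
    ((c["begin_x"] >= c["begin"]) & (c["begin_x"] <= c["end"]))
    | ((c["begin_x"] <= c["begin"]) & (c["end_x"] >= c["begin"]))
)

# System/id test from the SQL WHERE clause, simplified: with a.system fixed to
# 'metamap', "different system OR different id" covers both branches.
# note_id equality is already guaranteed by the merge key.
is_metamap = c["system"] == "metamap"
keep = is_metamap & ((c["system_x"] != c["system"]) | (c["id_"] != c["id_x"]))

result = c.loc[overlap & keep]
pairs = sorted(zip(result["id_"].tolist(), result["id_x"].tolist()))
```

Because every Series in the mask comes from c, they all share one index and the comparison error cannot occur. Note that the self-join still materializes every pair of rows within a note before filtering, so memory use grows quadratically with the number of rows per note.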


python pandas join merge

edited Mar 8 at 3:43

asked Mar 8 at 2:37

horcle_buzz

6991023

edited Mar 8 at 3:43

asked Mar 8 at 2:37

horcle_buzz

6991023

edited Mar 8 at 3:43

asked Mar 8 at 2:37

horcle_buzz

6991023

asked Mar 8 at 2:37

horcle_buzz

6991023

asked Mar 8 at 2:37

horcle_buzz

6991023

That means the Series a and b have different indexes, and pandas does not define Series comparison in this case. The same error occurs with the test a = pd.Series([1, 2], index=[0, 1]); b = pd.Series([1, 2], index=[0, 2]); a == b. Could you post a few lines of example data?

– Peter Leimbigler
Mar 8 at 2:48

Done. I'm basically trying to find overlaps in my begin and end columns across a single note_id instance..

– horcle_buzz
Mar 8 at 3:11

2

can you post the data not as an image but as actual text so that we can paste it into our IDE's? thanks!

– aws_apprentice
Mar 8 at 3:17

Done. Pasting from excel makes it an image, for some stupid reason.

– horcle_buzz
Mar 8 at 3:25

1

you should probably sample your data given what you provided does not match some of the conditions you specify, such as system == 'metamap'

– aws_apprentice
Mar 8 at 3:27

|
show 1 more comment

That means the Series a and b have different indexes, and pandas does not define Series comparison in this case. The same error occurs with the test a = pd.Series([1, 2], index=[0, 1]); b = pd.Series([1, 2], index=[0, 2]); a == b. Could you post a few lines of example data?

– Peter Leimbigler
Mar 8 at 2:48

Done. I'm basically trying to find overlaps in my begin and end columns across a single note_id instance..

– horcle_buzz
Mar 8 at 3:11

2

can you post the data not as an image but as actual text so that we can paste it into our IDE's? thanks!

– aws_apprentice
Mar 8 at 3:17

Done. Pasting from excel makes it an image, for some stupid reason.

– horcle_buzz
Mar 8 at 3:25

1

you should probably sample your data given what you provided does not match some of the conditions you specify, such as system == 'metamap'

– aws_apprentice
Mar 8 at 3:27

That means the Series a and b have different indexes, and pandas does not define Series comparison in this case. The same error occurs with the test a = pd.Series([1, 2], index=[0, 1]); b = pd.Series([1, 2], index=[0, 2]); a == b. Could you post a few lines of example data?

– Peter Leimbigler
Mar 8 at 2:48

Done. I'm basically trying to find overlaps in my begin and end columns across a single note_id instance..

– horcle_buzz
Mar 8 at 3:11

can you post the data not as an image but as actual text so that we can paste it into our IDE's? thanks!

– aws_apprentice
Mar 8 at 3:17

Done. Pasting from excel makes it an image, for some stupid reason.

– horcle_buzz
Mar 8 at 3:25

you should probably sample your data given what you provided does not match some of the conditions you specify, such as system == 'metamap'

– aws_apprentice
Mar 8 at 3:27

|
show 1 more comment
