Split SQL statements on function name but keep delimiter in Python


Assume I have the following string containing SQL expressions extracted from a SELECT clause (in reality this is a huge SQL statement with hundreds of such expressions):

 SUM(case when(A.money-B.money>1000
 and A.unixtime-B.unixtime<=890769
 and B.col10 = "A"
 and B.col11 = "12"
 and B.col12 = "V") then 10
 end) as finalCond0,
 MAX(case when(A.money-B.money<0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "4321"
 and B.cond3 in ("E", "F", "G")) then A.col10
 end) as finalCond1,
 SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

How can I split this string on the function names (e.g. SUM, MAX, MIN, MEAN) so that I can extract the last expression without removing the delimiter (which in this case is SUM)?

So the desired output would be a string like the one below:

 SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

PS: For presentation purposes I have added some indentation, but in reality these expressions are separated only by a comma, so no whitespace or newlines appear in the original form.

edited Mar 7 at 12:27

Martijn Pieters♦

719k14025112320

asked Mar 7 at 11:01

Old-School

155

New contributor

Have you tried to split by comma (,) ?

– Ralf
Mar 7 at 11:05

@Ralf It won't work in this scenario. A split on , (sql.split(',').pop()) would give "C")) then 2 end) as finalCond2

– Old-School
Mar 7 at 11:07

Hm... and .split(',\n') ?

– Ralf
Mar 7 at 11:09

@Ralf Won't work either. For presentation purposes I have provided some sort of indentation but in reality these statements are just separated by a comma meaning that no whitespaces or new lines appear in the original form.

– Old-School
Mar 7 at 11:11
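To make the objection in the comments above concrete, here is a minimal sketch using a shortened, made-up select list (the aliases `c0` and `c1` are placeholders, not names from the question). A plain `split(',')` also cuts at the commas inside the `in (...)` lists, so the last piece is only a fragment of the last expression:

```python
# Shortened, hypothetical stand-in for the select list in the question.
sql = ('SUM(case when(B.cond3 in ("E", "F")) then 1 end) as c0,'
       'MAX(case when(B.cond3 in ("A", "B", "C")) then 2 end) as c1')

pieces = sql.split(',')

# Every comma is treated as a separator, including those inside in (...).
print(len(pieces))
print(pieces[-1])
```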

python sql regex split


3 Answers
You can't use a regular expression here, because SQL syntax does not form regular patterns you could match with the Python re engine. You'd have to actually parse the string into a token stream or syntax tree; your SUM(...) can contain a wide array of syntax, including sub-selects, after all.

The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.

Re-using the walk_tokens function I defined in the other post I linked to:

from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
    # breadth-first walk over a token and all of its nested child tokens
    queue = deque([token])
    while queue:
        token = queue.popleft()
        if isinstance(token, TokenList):
            queue.extend(token)
        yield token

Extracting the last element from the SELECT identifier list then is:

import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
    if isinstance(tok, IdentifierList):
        # iterate to leave the last assigned to `identifier`
        for identifier in tok.get_identifiers():
            pass
        break

print(identifier)

Demo:

>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
...     if isinstance(tok, IdentifierList):
...         # iterate to leave the last assigned to `identifier`
...         for identifier in tok.get_identifiers():
...             pass
...         break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.
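The nesting problem described above can also be illustrated without any library. The following is only a rough sketch, not a substitute for a real parser such as sqlparse: the helper name `split_top_level` is made up for this example, and it assumes that string literals are double-quoted without escaped quotes and that the input contains no SQL comments. It splits only on commas that sit outside parentheses and outside string literals:

```python
def split_top_level(sql):
    """Split on commas outside parentheses and double-quoted strings."""
    parts = []
    start = 0
    depth = 0          # current parenthesis nesting level
    in_string = False  # True while inside a "..." literal
    for i, ch in enumerate(sql):
        if ch == '"':
            in_string = not in_string
        elif in_string:
            continue   # ignore everything inside a string literal
        elif ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch == ',' and depth == 0:
            parts.append(sql[start:i].strip())
            start = i + 1
    parts.append(sql[start:].strip())
    return parts


# Shortened, hypothetical stand-in for the select list in the question.
sql = ('SUM(case when(B.cond3 in ("E", "F")) then 1 end) as c0,'
       'MAX(case when(B.cond3 in ("A", "B", "C")) then 2 end) as c1')
items = split_top_level(sql)
print(items[-1])
```

Anything beyond these assumptions (single-quoted literals, escaped quotes, comments) would need extra handling, which is where a tokenizing library is likely the safer choice.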

edited Mar 7 at 11:57

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

Wouldn't it be much easier with a regular expression?

– Old-School
Mar 7 at 11:35

@Old-School: no, because Python's regular expressions can't be used to parse nested structures.

– Martijn Pieters♦
Mar 7 at 11:36

@Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to be used, etc.

– Martijn Pieters♦
Mar 7 at 11:37

Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

– Old-School
Mar 7 at 11:40

@Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

– Martijn Pieters♦
Mar 7 at 11:55


I have a solution, but it is quite a bit of code. It does not use regex, just repeated splitting on keywords.

s = """
SUM(case when(A.money-B.money>1000
 and A.unixtime-B.unixtime<=890769
 and B.col10 = "A"
 and B.col11 = "12"
 and B.col12 = "V") then 10
 end) as finalCond0,
MAX(case when(A.money-B.money<0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "4321"
 and B.cond3 in ("E", "F", "G")) then A.col10
 end) as finalCond1,
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2 
"""

# remove newlines and double spaces
s = s.replace('\n', ' ')
while '  ' in s:
    s = s.replace('  ', ' ')
s = s.strip()

# split on keywords, starting with the original string
current_parts = [s, ]
for kw in ['SUM', 'MAX', 'MIN']:
    new_parts = []
    for part in current_parts:
        for i, new_part in enumerate(part.split(kw)):
            if i > 0:
                # add keyword to the start of this substring
                new_part = '{}{}'.format(kw, new_part)

            new_part = new_part.strip()
            if len(new_part) > 0:
                new_parts.append(new_part.strip())

    current_parts = new_parts

print()
print('current_parts:')
for s in current_parts:
    print(s)

The output I get is:

current_parts:
SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0,
MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1,
SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2

Does it work for you? It seems to work for the example string you put in the question.
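A more compact way to split while keeping the function name is a lookahead in `re.split`. This is only a sketch under a strong assumption: a comma followed by `SUM(`, `MAX(` or `MIN(` always starts a new select item. An aggregate nested as a later argument, for example `COALESCE(x, MAX(y))`, would be split incorrectly. The sample string and the aliases `c0` to `c2` are made up for illustration:

```python
import re

# Shortened, hypothetical stand-in for the select list in the question.
sql = ('SUM(case when(B.cond3 in ("E", "F")) then 1 end) as c0,'
       'MAX(case when(B.cond3 in ("A", "B", "C")) then 2 end) as c1,'
       'SUM(case when(A.money-B.money>0) then 2 end) as c2')

# Consume the comma (and any whitespace), but only look ahead at the
# function name so that it stays at the start of the next piece.
pattern = r',\s*(?=(?:SUM|MAX|MIN)\()'
parts = re.split(pattern, sql)

print(parts[-1])
```

Because the function name is matched inside a lookahead, it is not consumed by the split and remains part of the following piece.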

edited Mar 7 at 11:31

answered Mar 7 at 11:27

Ralf

6,70141337

But I have already told you that there are no newlines (\n) in the string.

– Old-School
Mar 7 at 11:28

The code I posted will work whether there are newlines in the string or not. I just make sure that there are no newlines present at the start, but that won't affect the outcome if the string does not have any newlines in it.

– Ralf
Mar 7 at 11:29

Right OK. Thanks for the attempt.

– Old-School
Mar 7 at 11:30


You could use something like:

import re

sql = 'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0, MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1, SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'

# find every "as <alias>" occurrence
result = re.finditer(r'as\s+[a-zA-Z0-9]+', sql)

commas = []
parts = []

# record the position of each comma that directly follows an alias
for reg in result:
    end = reg.end()
    if len(sql) > end and sql[end] == ',':
        commas.append(end)

# cut the string at the recorded comma positions
idx = 0
for comma in commas:
    parts.append(sql[idx:comma])
    idx = comma + 1
parts.append(sql[idx:])

print(parts)

The commas list holds the positions of the commas to split on. Its contents will be:

[151, 322]

The parts list holds the final split pieces (I'm not sure this implementation is the best way):

[
 'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0',
 ' MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1',
 ' SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'
]

edited Mar 7 at 11:57

answered Mar 7 at 11:45

ALFA

543212

Is the output of this code even close to the desired output I've posted in my question?

– Old-School
Mar 7 at 11:47

Edited for you, check now!

– ALFA
Mar 7 at 11:58


3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.

Re-using the walk_tokens function I defined in the other post I linked to:

from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
 queue = deque([token])
 while queue:
 token = queue.popleft()
 if isinstance(token, TokenList):
 queue.extend(token)
 yield token

extracting the last element from the SELECT identifier list then is:

import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
 if isinstance(tok, IdentifierList):
 # iterate to leave the last assigned to `identifier`
 for identifier in tok.get_identifiers():
 pass
 break

print(identifier)

Demo:

>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.

edited Mar 7 at 11:57

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

Wouldn't be much easier with a regular expression?

– Old-School
Mar 7 at 11:35

@Old-School: no, because Python's regular expressions can't be used to parse nested structures.

– Martijn Pieters♦
Mar 7 at 11:36

@Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to used, etc.

– Martijn Pieters♦
Mar 7 at 11:37

Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

– Old-School
Mar 7 at 11:40

@Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

– Martijn Pieters♦
Mar 7 at 11:55

|
show 1 more comment

The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.

Re-using the walk_tokens function I defined in the other post I linked to:

from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
 queue = deque([token])
 while queue:
 token = queue.popleft()
 if isinstance(token, TokenList):
 queue.extend(token)
 yield token

extracting the last element from the SELECT identifier list then is:

import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
 if isinstance(tok, IdentifierList):
 # iterate to leave the last assigned to `identifier`
 for identifier in tok.get_identifiers():
 pass
 break

print(identifier)

Demo:

>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.

edited Mar 7 at 11:57

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

Wouldn't be much easier with a regular expression?

– Old-School
Mar 7 at 11:35

@Old-School: no, because Python's regular expressions can't be used to parse nested structures.

– Martijn Pieters♦
Mar 7 at 11:36

@Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to used, etc.

– Martijn Pieters♦
Mar 7 at 11:37

Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

– Old-School
Mar 7 at 11:40

@Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

– Martijn Pieters♦
Mar 7 at 11:55

|
show 1 more comment

The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.

Re-using the walk_tokens function I defined in the other post I linked to:

from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
 queue = deque([token])
 while queue:
 token = queue.popleft()
 if isinstance(token, TokenList):
 queue.extend(token)
 yield token

extracting the last element from the SELECT identifier list then is:

import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
 if isinstance(tok, IdentifierList):
 # iterate to leave the last assigned to `identifier`
 for identifier in tok.get_identifiers():
 pass
 break

print(identifier)

Demo:

>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.

edited Mar 7 at 11:57

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

The sqlparse library can do this, even though it is a bit underdocumented and not that friendly to external uses.

Re-using the walk_tokens function I defined in the other post I linked to:

from collections import deque
from sqlparse.sql import TokenList

def walk_tokens(token):
 queue = deque([token])
 while queue:
 token = queue.popleft()
 if isinstance(token, TokenList):
 queue.extend(token)
 yield token

extracting the last element from the SELECT identifier list then is:

import sqlparse
from sqlparse.sql import IdentifierList

tokens = sqlparse.parse(sql)[0]
for tok in walk_tokens(tokens):
 if isinstance(tok, IdentifierList):
 # iterate to leave the last assigned to `identifier`
 for identifier in tok.get_identifiers():
 pass
 break

print(identifier)

Demo:

>>> sql = '''
... SUM(case when(A.money-B.money>1000
... and A.unixtime-B.unixtime<=890769
... and B.col10 = "A"
... and B.col11 = "12"
... and B.col12 = "V") then 10
... end) as finalCond0,
... MAX(case when(A.money-B.money<0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "4321"
... and B.cond3 in ("E", "F", "G")) then A.col10
... end) as finalCond1,
... SUM(case when(A.money-B.money>0
... and A.unixtime-B.unixtime<=6786000
... and B.cond1 = "A"
... and B.cond2 = "1234"
... and B.cond3 in ("A", "B", "C")) then 2
... end) as finalCond2
... '''
>>> tokens = sqlparse.parse(sql)[0]
>>> for tok in walk_tokens(tokens):
... if isinstance(tok, IdentifierList):
... # iterate to leave the last assigned to `identifier`
... for identifier in tok.get_identifiers():
... pass
... break
...
>>> print(identifier)
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2

identifier is a sqlparse.sql.Identifier instance, but converting it to a string again (which print() does, or you can just use str()) gives you the input SQL string again for that section.

edited Mar 7 at 11:57

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

edited Mar 7 at 11:57

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

answered Mar 7 at 11:34

Martijn Pieters♦

719k14025112320

Wouldn't be much easier with a regular expression?

– Old-School
Mar 7 at 11:35

@Old-School: no, because Python's regular expressions can't be used to parse nested structures.

– Martijn Pieters♦
Mar 7 at 11:36

@Old-School: SQL is not 'regular', you can't predict the number of parentheses, or where commas are going to used, etc.

– Martijn Pieters♦
Mar 7 at 11:37

Would this solution work even if I don't have complete SQL statements? Meaning that they are just identifiers from a SELECT statement and not the actual statement per se.

– Old-School
Mar 7 at 11:40

@Old-School: I used your literal input, it's in the demonstration. No SELECT or FROM or JOIN in sight.

– Martijn Pieters♦
Mar 7 at 11:55


I have a solution, but it takes a fair amount of code. It does not use regex, just repeated splitting on keywords.

s = """
SUM(case when(A.money-B.money>1000
 and A.unixtime-B.unixtime<=890769
 and B.col10 = "A"
 and B.col11 = "12"
 and B.col12 = "V") then 10
 end) as finalCond0,
MAX(case when(A.money-B.money<0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "4321"
 and B.cond3 in ("E", "F", "G")) then A.col10
 end) as finalCond1,
SUM(case when(A.money-B.money>0
 and A.unixtime-B.unixtime<=6786000
 and B.cond1 = "A"
 and B.cond2 = "1234"
 and B.cond3 in ("A", "B", "C")) then 2
 end) as finalCond2 
"""

# remove newlines and double spaces
s = s.replace('\n', ' ')
while '  ' in s:
    s = s.replace('  ', ' ')
s = s.strip()

# split on keywords, starting with the original string
current_parts = [s, ]
for kw in ['SUM', 'MAX', 'MIN']:
    new_parts = []
    for part in current_parts:
        for i, new_part in enumerate(part.split(kw)):
            if i > 0:
                # add keyword to the start of this substring
                new_part = '{}{}'.format(kw, new_part)

            new_part = new_part.strip()
            if len(new_part) > 0:
                new_parts.append(new_part.strip())

    current_parts = new_parts

print()
print('current_parts:')
for s in current_parts:
    print(s)

The output I get is:

current_parts:
SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0,
MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1,
SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2

Does it work for you? It seems to work for the example string you put in the question.
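
For comparison, the keyword split described in this answer could also be sketched with a zero-width lookahead, which is one common way to split while keeping the delimiter. This is a hedged alternative, not the code from the answer: the helper name split_keep_keyword, the KEYWORDS tuple, and the short sample string are assumptions for illustration. Unlike the loop above, it consumes the separating comma.

```python
import re

KEYWORDS = ('SUM', 'MAX', 'MIN')  # assumed list of aggregate function names

# Match a comma plus whitespace only when a keyword followed by "(" comes
# next. The lookahead (?=...) consumes nothing, so the keyword stays at the
# start of the following part.
pattern = re.compile(r',\s*(?=(?:%s)\()' % '|'.join(KEYWORDS))


def split_keep_keyword(sql):
    # drop empty pieces and surrounding whitespace
    return [part.strip() for part in pattern.split(sql) if part.strip()]


# shortened, hypothetical input in the style of the question
s = 'SUM(a) as c0, MAX(case when(b in ("E", "F")) then 1 end) as c1, MIN(c) as c2'
for part in split_keep_keyword(s):
    print(part)
```

Like the keyword loop, this is a heuristic: it is case-sensitive, and it would split in the wrong place if a keyword call appeared as a later argument inside another call, for example SUM(x, MAX(y)).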

edited Mar 7 at 11:31

answered Mar 7 at 11:27

Ralf

6,70141337

But I have already told you that there are no newlines (\n) in the string.

– Old-School
Mar 7 at 11:28

The code I posted will work whether there are newlines in the string or not. I just make sure at the start that no newlines are present, which won't affect the outcome if the string does not have any newlines in it.

– Ralf
Mar 7 at 11:29

Right OK. Thanks for the attempt.

– Old-School
Mar 7 at 11:30


You could use something like:

import re

str = 'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0, MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1, SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'

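# note: the name `str` shadows Python's built-in type; it is kept to match the assignment above
result = re.finditer(r'as\s+[a-zA-Z0-9]+', str)

commas = []
parts = []

for reg in result:
    end = reg.end()
    # keep only aliases that are directly followed by a comma
    if len(str) > end and str[end] == ',':
        commas.append(end)

idx = 0
for comma in commas:
    parts.append(str[idx:comma])
    idx = comma + 1
parts.append(str[idx:])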

print(parts)

The commas list holds the indices of the commas to split on. Its contents will be:

[151, 322]

The parts list holds the final pieces (I'm not sure this implementation is the best way):

[
 'SUM(case when(A.money-B.money>1000 and A.unixtime-B.unixtime<=890769 and B.col10 = "A" and B.col11 = "12" and B.col12 = "V") then 10 end) as finalCond0',
 ' MAX(case when(A.money-B.money<0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "4321" and B.cond3 in ("E", "F", "G")) then A.col10 end) as finalCond1',
 ' SUM(case when(A.money-B.money>0 and A.unixtime-B.unixtime<=6786000 and B.cond1 = "A" and B.cond2 = "1234" and B.cond3 in ("A", "B", "C")) then 2 end) as finalCond2'
]
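
The alias-based idea above could be wrapped in a small function, sketched here under a few assumptions: the name split_after_alias is hypothetical, a \b word boundary is added so that words merely ending in "as" do not match, and each part is stripped so the leading spaces seen in the output above disappear.

```python
import re

# "as", whitespace, an alias, then the comma that ends the expression.
# \b keeps words that merely end in "as" (e.g. "alias x,") from matching.
ALIAS_COMMA = re.compile(r'\bas\s+\w+\s*(,)')


def split_after_alias(sql):
    parts = []
    idx = 0
    for match in ALIAS_COMMA.finditer(sql):
        comma = match.start(1)          # index of the separating comma
        parts.append(sql[idx:comma].strip())
        idx = comma + 1
    parts.append(sql[idx:].strip())     # whatever follows the last comma
    return parts


# shortened, hypothetical input in the style of the question
print(split_after_alias('SUM(a) as c0, MAX(b) as c1'))
```

This remains a heuristic: an "as name," sequence inside a string literal or inside a call such as CAST(x AS type, ...) would still be treated as a split point.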

edited Mar 7 at 11:57

answered Mar 7 at 11:45

ALFA

543212

Is the output of this code even close to the desired output I've posted in my question?

– Old-School
Mar 7 at 11:47

Edited for you, check now!

– ALFA
Mar 7 at 11:58
