Calculate CRC32, MD5 and SHA1 of zip content without decompression in Python
I need to calculate the CRC32, MD5 and SHA1 of the content of zip files without decompressing them.
So far I have found out how to calculate these for the zip file itself, e.g.:
CRC32:

    import zlib

    zip_name = "test.zip"

    def Crc32Hasher(file_path):
        buf_size = 65536
        crc32 = 0
        with open(file_path, 'rb') as f:
            while True:
                data = f.read(buf_size)
                if not data:
                    break
                crc32 = zlib.crc32(data, crc32)
        return format(crc32 & 0xFFFFFFFF, '08x')

    print(Crc32Hasher(zip_name))
SHA1: (MD5 similarly)

    import hashlib

    zip_name = "test.zip"

    def Sha1Hasher(file_path):
        buf_size = 65536
        sha1 = hashlib.sha1()
        with open(file_path, 'rb') as f:
            while True:
                data = f.read(buf_size)
                if not data:
                    break
                sha1.update(data)
        # hexdigest() already returns a string, so no format() call is needed
        return sha1.hexdigest()

    print(Sha1Hasher(zip_name))
For the content of the zip file, I can read the CRC32 directly from the zip without having to calculate it, as follows:
Read CRC32 of zip content:

    import zipfile

    zip_name = "test.zip"

    if zip_name.lower().endswith('.zip'):
        # the context manager closes the archive when done
        with zipfile.ZipFile(zip_name, "r") as z:
            for info in z.infolist():
                print(info.filename,
                      format(info.CRC & 0xFFFFFFFF, '08x'))
But I couldn't figure out how to calculate the SHA1 (or MD5) of the content of zip files without decompressing them first.
Is that somehow possible?
python hash md5 sha1 crc32
asked May 22 '17 at 3:55 by paradadf
1 Answer
It is not possible. You can get the CRC because it was precalculated when the archive was created (it is used for integrity checks). Any other checksum or hash has to be calculated from scratch, which requires at least streaming the archive content, i.e. unpacking it.
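To illustrate why the CRC comes for free: the ZIP format stores each member's CRC-32 of the uncompressed data in the archive's headers, so `zipfile` can report it from metadata alone. The following is a minimal sketch, not part of the original answer; it builds a throwaway in-memory archive, and the member name `sample.txt` and the payload are made up for the example.

```python
import io
import zipfile
import zlib

payload = b"hello zip" * 1000  # hypothetical sample data

# Build a small archive in memory so the example is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("sample.txt", payload)

# Reading the metadata does not decompress the member.
with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("sample.txt")
    stored_crc = info.CRC

# The stored value should match a CRC-32 computed over the uncompressed bytes.
computed_crc = zlib.crc32(payload) & 0xFFFFFFFF
print(format(stored_crc, "08x"), format(computed_crc, "08x"))
```

No MD5 or SHA1 field is stored this way in a standard archive, which is why those digests have to be computed from the decompressed stream.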
UPD: Possible implementations
libarchive: extra dependencies, supports many archive formats

    import hashlib
    import libarchive.public as libarchive

    fname = "test.zip"  # placeholder path

    with libarchive.file_reader(fname) as archive:
        for entry in archive:
            md5 = hashlib.md5()
            # hash the decompressed data block by block
            for block in entry.get_blocks():
                md5.update(block)
            print(str(entry), md5.hexdigest())
Native zipfile: no dependencies, zip only

    import hashlib
    import zipfile

    fname = "test.zip"  # placeholder path
    blocksize = 1024**2  # 1M chunks

    with zipfile.ZipFile(fname) as archive:
        # use a separate loop variable so the archive path is not overwritten
        for name in archive.namelist():
            md5 = hashlib.md5()
            with archive.open(name) as entry:
                while True:
                    block = entry.read(blocksize)
                    if not block:
                        break
                    md5.update(block)
            print(name, md5.hexdigest())
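Since the question asks for CRC32, MD5 and SHA1 together, the native `zipfile` approach can be extended to compute all three in a single pass over each member. This is a sketch under assumptions rather than the answer author's code: the function name `hash_zip_members` and its `blocksize` parameter are hypothetical, and `ZipInfo.is_dir()` requires Python 3.6 or later.

```python
import hashlib
import zipfile
import zlib


def hash_zip_members(zip_path, blocksize=1024 * 1024):
    """Yield (name, crc32, md5, sha1) hex strings for each file in the archive.

    zip_path may be a path or a seekable binary file object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            crc = 0
            md5 = hashlib.md5()
            sha1 = hashlib.sha1()
            with archive.open(info) as entry:
                # Decompress in fixed-size chunks and feed each chunk
                # to all three checksums before reading the next one.
                for block in iter(lambda: entry.read(blocksize), b""):
                    crc = zlib.crc32(block, crc)
                    md5.update(block)
                    sha1.update(block)
            yield (info.filename,
                   format(crc & 0xFFFFFFFF, "08x"),
                   md5.hexdigest(),
                   sha1.hexdigest())
```

It could be used as `for name, crc, md5, sha1 in hash_zip_members("test.zip"): print(name, crc, md5, sha1)`. Memory use is bounded by roughly one `blocksize` chunk of decompressed data at a time, and nothing is written to disk.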
answered May 22 '17 at 4:04 by Marat (edited 6 hours ago by Steve Barnes)
Thanks for answering. What would be the most memory-efficient way of doing that? – paradadf, May 23 '17 at 16:17
@paradadf Updated the answer. – Marat, May 25 '17 at 18:43