Issue installing pdftotext in Python 3.6 on CentOS due to poppler
I'm having some issues installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.
Some quick notes first:
- I'm using CentOS 6.7 on VirtualBox
- I know it can work because my IT group has it installed on our server. NOTE: I found that our server does have the C++ wrapper installed, and I'm trying to figure out how they got it.
- I'm trying to get an existing application to work, so I'm not looking for an alternative to the pdftotext library at this time.
I followed the instructions from the GitHub repo and already tried this step:
Fedora, Red Hat, and friends:
sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:
============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files
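To double-check that listing without reading it by eye, here is a minimal sketch that parses `yum search` style output and tests for a package name. The helper name `package_names` and the shortened `SAMPLE` text are my own illustration, not part of yum or pdftotext, and the parsing assumes the "name.arch : summary" layout shown above.

```python
def package_names(yum_search_output):
    """Return package names (architecture suffix stripped) from `yum search` output."""
    names = set()
    for line in yum_search_output.splitlines():
        # Package lines look like "name.arch : summary"; skip headers and blanks.
        if " : " not in line:
            continue
        name_arch = line.split(" : ", 1)[0].strip()
        # Drop the trailing ".arch" part (e.g. ".x86_64", ".noarch").
        names.add(name_arch.rsplit(".", 1)[0])
    return names


# Shortened excerpt of the listing above, used only as sample input.
SAMPLE = """\
============================= N/S Matched: poppler =============================
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-utils.x86_64 : Command line utilities for converting PDF files
"""

found = package_names(SAMPLE)
print("poppler-cpp-devel available:", "poppler-cpp-devel" in found)
print("poppler-devel available:", "poppler-devel" in found)
```

In practice the real output of `yum search poppler` could be piped into the same function instead of the sample string.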
My IT group gave me the instructions they had followed, so I tried installing poppler-devel and poppler-glib. But every time I run pip install pdftotext I get the following output:
[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-1mu2f1n2/pdftotext/
I'm assuming the problem here is that the build is looking for the poppler C++ wrapper files and I could only get the glib wrapper?
What can I look into?
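To narrow the long compiler output down to its root cause, here is a rough sketch that pulls out only the headers gcc could not find. The helper name `missing_headers` and the regular expression are hypothetical and match the "No such file or directory" wording in the log above; other compilers or gcc versions may phrase it differently.

```python
import re

# Matches the "No such file or directory" form that gcc prints in the log above.
MISSING_HEADER = re.compile(r"error: (\S+\.h): No such file or directory")


def missing_headers(build_log):
    """Return the headers the compiler could not find, in first-seen order, without repeats."""
    seen = []
    for match in MISSING_HEADER.finditer(build_log):
        header = match.group(1)
        if header not in seen:
            seen.append(header)
    return seen


# A few lines copied from the pip output above, used only as sample input.
LOG = """\
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
"""

for header in missing_headers(LOG):
    print(header)
```

All three missing headers sit under poppler/cpp/, which is consistent with the C++ wrapper headers being absent. The later "has not been declared" errors look like follow-on failures from those missing includes.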
linux python-3.x centos pdftotext poppler
Is the IT group's server running CentOS 7, by any chance? It looks as if poppler-cpp-devel is available on CentOS 7 but not on CentOS 6.
– B. Shefter
Mar 8 at 20:15
They're running CentOS release 6.7 (Final)
– Michael Stackhouse
Mar 8 at 21:25
Thanks for adding a self-answer here. Unfortunately that needed to be deleted, as we don't accept link-only answers. If you can expand it so the instructions are in the answer itself, with suitable attributions as necessary, that would be welcome.
– halfer
Mar 10 at 17:08
@halfer I edited the response to include the full formatted text from the source.
– Michael Stackhouse
Mar 18 at 19:56
add a comment |
I'm having some issues getting installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.
Some quick notes first:
- I'm using CentOS 6.7 on VirtualBox
- I know it can work because my IT group has it installed on our server. NOTE: I found that our server did have the C++ wrapper installed and I'm trying to figure out how the got it.
- I'm trying to get an existing application to work, so I'm not looking for an alternative to
pdftotextthe library at this time.
I followed the instructions from the github repo and already tried this step:
Fedora, Red Hat, and friends:
sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:
============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files
My IT group gave me the instructions of what they had attempted and I tried installing poppler-devel and poppler-glib. But every time I try pip install pdftotext I'm getting the following output:
[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('rn', 'n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-1mu2f1n2/pdftotext/
I'm assuming the problem here is that it's looking for the C++ compiled files and I could only get the glib?
What I can look into?
linux python-3.x centos pdftotext poppler
I'm having some issues getting installing pdftotext in Python 3.6 (Anaconda 5.1.0) on CentOS.
Some quick notes first:
- I'm using CentOS 6.7 on VirtualBox
- I know it can work because my IT group has it installed on our server. NOTE: I found that our server did have the C++ wrapper installed and I'm trying to figure out how they got it.
- I'm trying to get an existing application to work, so I'm not looking for an alternative to the pdftotext library at this time.
I followed the instructions from the github repo and already tried this step:
Fedora, Red Hat, and friends:
sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
But the problem seems to be around poppler-cpp-devel. I don't see that package within yum search poppler:
============================= N/S Matched: poppler =============================
poppler-devel.i686 : Libraries and headers for poppler
poppler-devel.x86_64 : Libraries and headers for poppler
poppler-glib.i686 : Glib wrapper for poppler
poppler-glib.x86_64 : Glib wrapper for poppler
poppler-qt.i686 : Qt3 wrapper for poppler
poppler-qt.x86_64 : Qt3 wrapper for poppler
poppler-qt4.i686 : Qt4 wrapper for poppler
poppler-qt4.x86_64 : Qt4 wrapper for poppler
poppler.i686 : PDF rendering library
poppler.x86_64 : PDF rendering library
poppler-data.noarch : Encoding files
poppler-glib-devel.i686 : Development files for glib wrapper
poppler-glib-devel.x86_64 : Development files for glib wrapper
poppler-qt-devel.i686 : Development files for Qt3 wrapper
poppler-qt-devel.x86_64 : Development files for Qt3 wrapper
poppler-qt4-devel.i686 : Development files for Qt4 wrapper
poppler-qt4-devel.x86_64 : Development files for Qt4 wrapper
poppler-utils.x86_64 : Command line utilities for converting PDF files
My IT group gave me the instructions of what they had attempted and I tried installing poppler-devel and poppler-glib. But every time I try pip install pdftotext I'm getting the following output:
[root@localhost stack]# pip install pdftotext
Collecting pdftotext
Using cached https://files.pythonhosted.org/packages/21/35/60094dbadd9de2035873390b1cac25e01da605844eba6a07a53a82fa4adc/pdftotext-2.1.1.tar.gz
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-khm9zova --python-tag cp36:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command /root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile:
/root/anaconda3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)
running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -I/root/anaconda3/include/python3.6m -c pdftotext.cpp -o build/temp.linux-x86_64-3.6/pdftotext.o -Wall
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
pdftotext.cpp:3:42: error: poppler/cpp/poppler-document.h: No such file or directory
pdftotext.cpp:4:40: error: poppler/cpp/poppler-global.h: No such file or directory
pdftotext.cpp:5:38: error: poppler/cpp/poppler-page.h: No such file or directory
pdftotext.cpp:20: error: ‘poppler’ has not been declared
pdftotext.cpp:20: error: ISO C++ forbids declaration of ‘document’ with no type
pdftotext.cpp:20: error: expected ‘;’ before ‘*’ token
pdftotext.cpp: In function ‘void PDF_clear(PDF*)’:
pdftotext.cpp:26: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:27: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_create_doc(PDF*)’:
pdftotext.cpp:66: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:66: error: ‘poppler’ has not been declared
pdftotext.cpp:67: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_unlock(PDF*, char*)’:
pdftotext.cpp:75: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘int PDF_init(PDF*, PyObject*, PyObject*)’:
pdftotext.cpp:105: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp: In function ‘PyObject* PDF_read_page(PDF*, int)’:
pdftotext.cpp:119: error: ‘poppler’ has not been declared
pdftotext.cpp:119: error: expected initializer before ‘*’ token
pdftotext.cpp:120: error: ‘poppler’ has not been declared
pdftotext.cpp:120: error: expected ‘;’ before ‘layout_mode’
pdftotext.cpp:123: error: ‘page’ was not declared in this scope
pdftotext.cpp:123: error: ‘struct PDF’ has no member named ‘doc’
pdftotext.cpp:129: error: ‘poppler’ has not been declared
pdftotext.cpp:129: error: expected initializer before ‘rect’
pdftotext.cpp:130: error: ‘rect’ was not declared in this scope
pdftotext.cpp:133: error: ‘layout_mode’ was not declared in this scope
pdftotext.cpp:133: error: ‘poppler’ has not been declared
pdftotext.cpp:135: error: ‘poppler’ has not been declared
pdftotext.cpp:137: error: ‘poppler’ has not been declared
pdftotext.cpp:138: error: type ‘<type error>’ argument given to ‘delete’, expected pointer
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/root/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-1mu2f1n2/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ghuhvuhl/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-1mu2f1n2/pdftotext/
I'm assuming the problem is that the build is looking for the poppler C++ wrapper headers (poppler/cpp/...), and I could only get the glib wrapper?
What can I look into?
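One way to confirm that diagnosis before retrying pip is to check whether the three headers named in the compiler errors exist anywhere the compiler is likely to look. The sketch below is only an illustration: the helper name `missing_headers` is hypothetical, and the two directories in the usage section are assumed default search paths, not something taken from the pdftotext build itself.

```python
import os

# Header paths copied from the "No such file or directory" errors above.
REQUIRED_HEADERS = [
    "poppler/cpp/poppler-document.h",
    "poppler/cpp/poppler-global.h",
    "poppler/cpp/poppler-page.h",
]


def missing_headers(include_dirs, headers=REQUIRED_HEADERS):
    """Return the headers that are not present under any include directory."""
    missing = []
    for header in headers:
        found = any(
            os.path.isfile(os.path.join(directory, header))
            for directory in include_dirs
        )
        if not found:
            missing.append(header)
    return missing


if __name__ == "__main__":
    # Assumption: these are the include directories worth checking on this box.
    for header in missing_headers(["/usr/include", "/usr/local/include"]):
        print("missing:", header)
```

If all three headers are reported missing, installing poppler-devel or poppler-glib-devel did not provide the C++ wrapper, which would match the errors in the log.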
linux python-3.x centos pdftotext poppler
edited Mar 10 at 17:06 by halfer
asked Mar 8 at 1:57 by Michael Stackhouse
Is the IT group's server running CentOS 7, by any chance? It looks as if poppler-cpp-devel is available on CentOS 7 but not on CentOS 6.
– B. Shefter
Mar 8 at 20:15
They're running CentOS release 6.7 (Final)
– Michael Stackhouse
Mar 8 at 21:25
Thanks for adding a self-answer here. Unfortunately that needed to be deleted, as we don't accept link-only answers. If you can expand it so the instructions are in the answer itself, with suitable attributions as necessary, that would be welcome.
– halfer
Mar 10 at 17:08
@halfer I edited the response to include the full formatted text from the source.
– Michael Stackhouse
Mar 18 at 19:56
2 Answers
pdftotext should be in poppler-utils, so try yum install poppler-utils
EDIT: Hmm. There's a package called pypoppler available for CentOS 6 in the EPEL repository, which describes itself as "Python bindings for the Poppler PDF rendering library." I see no indication that it includes poppler/cpp/anything, but you can give it a try. (You may need to install pycairo first.)
Failing that, you might try installing an earlier version of pdftotext (e.g. pip install pdftotext==1.0.0) to find one compatible with CentOS 6. The earliest version came out in June of 2017, though, so that may not help.
I don't suppose you're interested in upgrading to CentOS 7?
edited Mar 8 at 21:09
answered Mar 8 at 3:40 by B. Shefter
I tried that one too but it's just the command line utilities. The Python library pdftotext distinctly seems to rely on the C++ wrapper for poppler.
– Michael Stackhouse
Mar 8 at 15:17
For pypoppler I tried that too to no avail. And no option to upgrade the OS since I'm trying to emulate my company server. I'll post an edit but my IT got back to me and they manually installed the CPP version so I asked how they did it. I plan to try to submit a pull request to the pdftotext github repo detailing my solution once I figure it out too.
– Michael Stackhouse
Mar 8 at 21:18
I found the solution to this. By following the instructions for installing libpoppler-cpp from this link, I was able to successfully install pdftotext.
Following the instructions from this repo:
On CentOS
On CentOS the libpoppler-cpp library is not included with the system so we need to build from source. Note that recent versions of poppler require C++11 which is not available on CentOS, so we build a slightly older version of libpoppler.
# Build dependencies
yum install wget xz libjpeg-devel openjpeg2-devel
# Download and extract
wget https://poppler.freedesktop.org/poppler-0.47.0.tar.xz
tar -Jxvf poppler-0.47.0.tar.xz
cd poppler-0.47.0
# Build and install
./configure
make
sudo make install
By default libraries get installed in /usr/local/lib and /usr/local/include. On CentOS this is not a default search path so we need to set PKG_CONFIG_PATH and LD_LIBRARY_PATH to point R to the right directory:
export LD_LIBRARY_PATH="/usr/local/lib"
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig"
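Before rerunning pip install pdftotext, it may help to confirm that the freshly built library is actually discoverable with those two variables set. The following is a minimal sketch under stated assumptions: the helper names `env_with_prefix` and `poppler_cpp_visible` are hypothetical, the `/usr/local` prefix is the default from the steps above, and `poppler-cpp` is assumed to be the pkg-config module name installed by the poppler build.

```python
import os
import shutil
import subprocess


def env_with_prefix(prefix="/usr/local", base_env=None):
    """Return a copy of the environment with the prefix's library and
    pkg-config directories prepended, mirroring the two export lines."""
    env = dict(os.environ if base_env is None else base_env)
    additions = {
        "LD_LIBRARY_PATH": os.path.join(prefix, "lib"),
        "PKG_CONFIG_PATH": os.path.join(prefix, "lib", "pkgconfig"),
    }
    for name, path in additions.items():
        existing = env.get(name)
        # Prepend so the locally built poppler wins over any system copy.
        env[name] = path if not existing else path + os.pathsep + existing
    return env


def poppler_cpp_visible(env):
    """True or False depending on whether pkg-config finds poppler-cpp;
    None if pkg-config itself is not installed."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(["pkg-config", "--exists", "poppler-cpp"], env=env)
    return result.returncode == 0


if __name__ == "__main__":
    print("poppler-cpp visible:", poppler_cpp_visible(env_with_prefix()))
```

Prepending rather than overwriting keeps any existing LD_LIBRARY_PATH entries, which differs slightly from the plain export lines above; that is a design choice of this sketch, not part of the quoted instructions.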
edited Mar 18 at 22:50 by halfer
answered Mar 8 at 22:01 by Michael Stackhouse