- 1 Installation on Ubuntu
- 2 Installation on Debian
- 3 Installation on CentOS 7
- 4 Installation on Windows
- 5 Native Windows Command Line Version
- 6 Installation on other OS
- 7 Installation Diagnostics and Validation
Installation on Ubuntu
sudo apt-get install mediawiki2latex
Currently version 7.30 is available from the Ubuntu package repositories. That version does not work properly with tables on recent installations of MediaWiki. To resolve these issues you may follow the instructions for installing version 7.33, which are given for Debian below but also apply to Ubuntu. Since you usually cannot log in as root on Ubuntu systems, run each of the given commands as your normal user with a preceding "sudo " in order to get the privileges needed to install new software on your system.
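Whether the packaged version is older than 7.33 can be checked before deciding on the source build. The following is a minimal sketch, not part of mediawiki2latex: the helper names version_lt and report_mediawiki2latex_version are made up for illustration, and the comparison relies on GNU sort -V, which is a simplification of Debian's own version ordering.

```shell
#!/bin/sh
# Succeeds when version $1 is strictly older than version $2.
# Assumption: GNU sort with -V (version sort) is available.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print a hint based on the installed package version (hypothetical helper).
report_mediawiki2latex_version() {
    if ! command -v dpkg-query >/dev/null 2>&1; then
        echo "dpkg-query not found; this check only applies to Debian-like systems"
        return 0
    fi
    installed=$(dpkg-query -W -f='${Version}' mediawiki2latex 2>/dev/null || true)
    if [ -z "$installed" ]; then
        echo "mediawiki2latex is not installed"
    elif version_lt "$installed" "7.33"; then
        echo "installed version $installed is older than 7.33"
    else
        echo "installed version $installed is 7.33 or newer"
    fi
}

report_mediawiki2latex_version
```

For an exact comparison that follows Debian's rules, dpkg --compare-versions can be used instead of the sort-based helper.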
Installation on Debian
mediawiki2latex is included in the Debian Stretch distribution and works out of the box, but the output is limited to a few pages. To work around this limitation, install mediawiki2latex version 7.33 on Stretch. First install the packaged mediawiki2latex version 7.25 by typing (as root):
apt-get install mediawiki2latex
Then install the build time dependencies (as root).
apt-get install ghc libghc-x509-dev libghc-pem-dev
apt-get install libghc-regex-compat-dev libghc-http-dev cabal-install libghc-hxt-dev
apt-get install libghc-split-dev libghc-blaze-html-dev libghc-file-embed-dev
apt-get install libghc-highlighting-kate-dev libghc-hxt-http-dev libghc-regex-pcre-dev
apt-get install libghc-temporary-dev libghc-url-dev libghc-utf8-string-dev
apt-get install libghc-utility-ht-dev libghc-http-conduit-dev libghc-happstack-server-dev
apt-get install libghc-directory-tree-dev libghc-zip-archive-dev libghc-strict-dev
apt-get install libghc-network-uri-dev libghc-tagsoup-dev libghc-word8-dev
apt-get install ghostscript calibre latex2rtf libreoffice
Download mediawiki2latex version 7.33 from SourceForge (Download Link). Extract the archive and run (as root, in the directory in which you extracted the archive from SourceForge)
Finally, to enable image conversions, make sure that ImageMagick is installed and that it has permission to transform PS and PDF files to PNG.
To install (if needed):
apt-get install imagemagick
Edit the permissions (as root) in /etc/ImageMagick-6/policy.xml:
<policy domain="coder" rights="read|write" pattern="PS" />
<policy domain="coder" rights="none|write" pattern="PS2" />
<policy domain="coder" rights="none|write" pattern="PS3" />
<policy domain="coder" rights="none|write" pattern="EPS" />
<policy domain="coder" rights="read|write" pattern="PDF" />
<policy domain="coder" rights="read|write" pattern="XPS" />
Note that this may entail some risks on a server machine, as explained in the article Solution to ImageMagick "not authorized" PDF Error by Bob Cromwell.
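To confirm that the policy file grants the rights described above, the rights attribute for a given pattern can be read back from the file. This is a rough sketch under stated assumptions: the function names policy_rights and allows_read_write are hypothetical, the attributes are assumed to appear in the order domain, rights, pattern, and commented-out or duplicate entries for the same pattern are not handled.

```shell
#!/bin/sh
# Print the rights attribute set for one coder pattern (e.g. PDF) in a policy file.
# $1: path to policy.xml, $2: coder pattern such as PS or PDF.
policy_rights() {
    sed -n "s/.*<policy domain=\"coder\" rights=\"\([^\"]*\)\" pattern=\"$2\".*/\1/p" "$1"
}

# Succeeds when the pattern is granted exactly read|write.
allows_read_write() {
    [ "$(policy_rights "$1" "$2")" = "read|write" ]
}

policy_file=/etc/ImageMagick-6/policy.xml
if [ -r "$policy_file" ]; then
    for pattern in PS PDF; do
        if allows_read_write "$policy_file" "$pattern"; then
            echo "$pattern: read|write"
        else
            echo "$pattern: not set to read|write"
        fi
    done
fi
```

The check only reads the file, so it can be run as a normal user.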
Installation on CentOS 7
The instructions below apply to CentOS 7 (and likely CentOS 6). The primary concern with a CentOS 7 installation is to avoid the standard CentOS repository packages. Specifically, the standard CentOS "epel" (Extra Packages for Enterprise Linux) repository contains ghc, cabal-install, and texlive; however, the versions in epel are either incompatible (ghc and cabal-install) or are missing many components (texlive). Finally, there are font dependencies that must be installed in order for mediawiki2latex to generate PDFs.
Prepare and Compile mediawiki2latex
The following versions of GHC, Cabal and TeX Live are compatible with mediawiki2latex 7.33.
- Install the latest GHC compiler. As of 2019-01, this is available and documented at: https://copr.fedorainfracloud.org/coprs/petersen/ghc-8.0.2
- Create /etc/yum.repos.d/petersen-ghc-8.0.2-epel-7.repo:
name=Copr repo for ghc-8.0.2 owned by petersen
baseurl=https://copr-be.cloud.fedoraproject.org/results/petersen/ghc-8.0.2/epel-7-$basearch/
type=rpm-md
skip_if_unavailable=True
gpgcheck=1
gpgkey=https://copr-be.cloud.fedoraproject.org/results/petersen/ghc-8.0.2/pubkey.gpg
repo_gpgcheck=0
enabled=1
enabled_metadata=1
- yum --disablerepo=epel install ghc cabal-install
- cabal update
- Download and install the latest TeX Live from: http://mirror.ctan.org/systems/texlive/tlnet/install-tl-unx.tar.gz. This is a "live" installer (it downloads the install content as the install runs).
- wget http://mirror.ctan.org/systems/texlive/tlnet/install-tl-unx.tar.gz
- tar xvzf install-tl-unx.tar.gz
- cd install-tl-[build-date]
- Note that the TeX Live install-tl run is lengthy (5+ hours). Optionally, run the process in the background and detach it from the current login session:
- nohup sh -c "echo I | ./install-tl" > texlive-install.log 2>&1 &
- Note that the command includes "echo I", which supplies the "I" (Install) keyboard input that install-tl requires.
- nohup allows the install to keep running after you log out.
- Download and install the latest mediawiki2latex source.
- git clone https://git.code.sf.net/p/wb2pdf/git wb2pdf-git
- cd wb2pdf-git
- cabal install
- All going well, this will result in a binary wb2pdf-git/dist/build/mediawiki2latex.
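Because the TeX Live install above runs for hours in the background, it can help to check later whether it is still running. The sketch below assumes the process ID was saved right after the nohup command was started, for example with echo $! > texlive-install.pid. That file name and the helper names is_running and job_status are made up for illustration.

```shell
#!/bin/sh
# Succeeds while a process with the given PID still exists.
# Note: kill -0 sends no signal, it only tests whether the PID can be signalled.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Print a one-line status for a PID that was saved when the job was started.
job_status() {
    if is_running "$1"; then
        echo "running"
    else
        echo "finished"
    fi
}
```

A later login session could then call job_status "$(cat texlive-install.pid)" and follow progress with tail -f texlive-install.log. The check is only a heuristic, since a PID can be reused by an unrelated process after the install has ended.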
Some fonts needed by mediawiki2latex are not made available by the prior installation steps (e.g. GNU FreeFont).
- wget http://ftp.gnu.org/gnu/freefont/freefont-ttf-20120503.zip
- unzip freefont-ttf-20120503.zip
- cd freefont-20120503
- mkdir /usr/share/fonts/truetype/freefont
- cp *.ttf /usr/share/fonts/truetype/freefont
- fc-cache -f /usr/share/fonts
- Note: running fc-cache is not strictly needed; however, it is generally good practice in order to fully register the fonts in CentOS.
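After the font steps above, the output of fc-list can be used to see whether the FreeFont families are registered. The following sketch assumes the families of interest are FreeSerif, FreeSans and FreeMono. The function name missing_freefont_families is hypothetical, and the match is a plain case-insensitive text search over the listing.

```shell
#!/bin/sh
# Read `fc-list` output on stdin and print each GNU FreeFont family that is absent.
missing_freefont_families() {
    listing=$(cat)
    for family in FreeSerif FreeSans FreeMono; do
        printf '%s\n' "$listing" | grep -qi "$family" || printf '%s\n' "$family"
    done
}

# Only query fontconfig when fc-list is actually installed.
if command -v fc-list >/dev/null 2>&1; then
    fc-list | missing_freefont_families
fi
```

No output from the final command suggests that all three families were found.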
Installation on Windows
These installation instructions are outdated. Windows changes so frequently that I am not going to update them. I was able to install mediawiki2latex in the Ubuntu 18.04 app on Windows on 25 May 2019, but I had to recompile mediawiki2latex from source. I was not able to use the makefile and instead called the compiler from the command line in the src directory, after copying the document babel and latex directories from trunk to src. Good luck!
- Go to Control Panel -> Programs -> Turn Windows Features On / Off
- The Windows Features Dialog will open
- Scroll to the bottom
- Enable Windows Subsystem for Linux
- Press OK
- Install the app "Ubuntu 18.04" from the Windows Appstore.
- When starting the app for the first time, you will be asked to set your username and password, which you need to remember
- In the app, type
sudo apt-get update
and press Enter. You will then need to enter the password you defined above.
- In the app, type
sudo apt-get install mediawiki2latex
and press Enter.
- The download and installation will take some time.
- Run the Ubuntu app if it is not already open. In the app, type
sudo mediawiki2latex -s 80
and press Enter, then enter the password as above.
- Keep the app open and, in Windows, open your normal web browser.
- In the address bar, type
localhost
and press Enter.
- mediawiki2latex web version is now running on your local Windows computer.
Compiling large Books
- The mediawiki2latex web server has a time limit of one hour built in, so very large books with more than about 500 pages may fail to convert. But there is a workaround:
- Go back to the Ubuntu App
- Press and hold the CTRL key and, while holding it, press the C key once to stop the mediawiki2latex web server.
- Type
mediawiki2latex -u https://en.wikipedia.org/wiki/Homomorphism -o mybook.pdf
and press Enter.
- After the command finishes, open Windows Explorer on your Windows Desktop and search for the file mybook.pdf
- Double-click the file to open it in your PDF viewer
- Replace the link
https://en.wikipedia.org/wiki/Homomorphism
with the link to the large article you want to compile and repeat the above steps to get your desired result.
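When several large articles need converting, the command-line call above can be wrapped in a loop. This is a minimal sketch that uses only the -u and -o options shown above. The helper names pdf_name_for_url and convert_all are hypothetical, and naming the output after the last path component of the URL is an assumption that suits simple article links but not URLs with query strings.

```shell
#!/bin/sh
# Derive an output file name from the last path component of an article URL.
pdf_name_for_url() {
    printf '%s.pdf\n' "$(basename "$1")"
}

# Convert each URL given as an argument, one mediawiki2latex run per article.
convert_all() {
    for url in "$@"; do
        out=$(pdf_name_for_url "$url")
        echo "Converting $url -> $out"
        mediawiki2latex -u "$url" -o "$out" || echo "conversion failed for $url" >&2
    done
}
```

For example, convert_all https://en.wikipedia.org/wiki/Homomorphism would write Homomorphism.pdf to the current directory. A failed article is reported and the loop continues with the next one.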
Updating the converter
- It is highly recommended to update mediawiki2latex to version 7.33 in order to work with recent releases of MediaWiki.
- Follow the steps given above in the installation instruction for Ubuntu. You will have to do this inside the Ubuntu 18.04 App.
- To extract the .tar.gz archive from SourceForge, we recommend using the 7-Zip extraction software.
Native Windows Command Line Version
We also provide an experimental command line version that runs on Windows without needing to install anything. The zip archive containing it, as well as all required tools (in particular MiKTeX), is called MediaWikiToLaTeX.zip. It may be downloaded from:
We do not recommend using this native command line version; instead, we suggest following the installation instructions above. Furthermore, we were not able to produce any results on a recently patched version of the OS on 25 May 2019.
Installation on other OS
I recommend that you use VirtualBox or a similar virtual machine and run Ubuntu in it. The program itself is already larger than Ubuntu, so installing Ubuntu does not add much overhead. The large size of the program is due to its many dependencies on LaTeX packages and fonts, and the way they are packaged. In total, a little over 1 GByte of packages has to be downloaded during the installation on Ubuntu. The .tar.gz archive of the source code is also available on SourceForge.
Installation Diagnostics and Validation
A recommended step to test a mediawiki2latex install is to run the following test:
mkdir rmtest
mediawiki2latex -u https://en.wikipedia.org/wiki/Book:River_martin -o rivermartin.pdf -k -c rmtest
If mediawiki2latex appears to finish and generates rivermartin.pdf, examine rivermartin.pdf; it should be some 84 pages long. If a PDF is not generated, then:
- cd rmtest/document/main
- xelatex main.tex
- xelatex -interaction=nonstopmode main.tex
Review the detailed output of xelatex.
If a version of rivermartin.pdf was generated, compare it with the version of the River_martin test case generated by the public server at http://mediawiki2latex-large.wmflabs.org/
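The page-count expectation for the test run can also be checked from the command line. The sketch below assumes the pdfinfo tool from poppler-utils is installed, which is not one of the dependencies listed in this guide. The helper names pages_from_pdfinfo and check_page_count are hypothetical.

```shell
#!/bin/sh
# Extract the page count from `pdfinfo` output read on stdin.
pages_from_pdfinfo() {
    awk '/^Pages:/ { print $2 }'
}

# Compare the page count of a PDF against an expected value.
# $1: PDF file, $2: expected number of pages.
check_page_count() {
    pages=$(pdfinfo "$1" | pages_from_pdfinfo)
    if [ "$pages" = "$2" ]; then
        echo "$1 has $pages pages, as expected"
    else
        echo "$1 has ${pages:-unknown} pages, expected $2"
        return 1
    fi
}
```

A call such as check_page_count rivermartin.pdf 84 reports a mismatch when the count differs. Since the expected length is only approximate, a small deviation does not necessarily indicate a broken install.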