Commits

John Lenz  committed 0f8fdab

Add post on bibtex and gitit

  • Parent commits 9d27b16

Files changed (2)

File posts/2012-06-26-bibtex-and-gitit.markdown

+---
+title: BibTeX and Gitit
+author: John Lenz
+tags: haskell, latex
+date: June 26, 2012
+---
+
+This post is part four (and the planned final part) in my series on managing LaTeX references using
+pandoc. In the [first post](2012-06-14-reference-management.html), I described at a high level how I
+manage my LaTeX references.  The [second](2012-06-15-bibtex-and-pandoc.html) and
+[third](2012-06-19-bibtex-and-pandoc-2.html) posts contain some Haskell code to manage these marked
+up BibTeX files.  In this post, I describe how I use [gitit](http://gitit.net/) to browse and edit
+these files in a web browser.
+
+As a side note, after writing the [second](2012-06-15-bibtex-and-pandoc.html) and
+[third](2012-06-19-bibtex-and-pandoc-2.html) posts, I am considering posting the code from them on
+Hackage so people can use these programs without copying code from blog posts.  Also, Haskell has
+the nice feature that compiled binaries can be run without having Haskell installed.  (Pandoc uses
+this to provide a Windows binary.)  So I could also provide binary downloads of the tools, which
+could be used as is without installing Haskell.  At the moment, I haven't done that, since you need a
+little familiarity with the command line and Mercurial to work with these files, and for people who
+know their way around the command line, running "cabal install" or "runhaskell" isn't that hard.
+
+# Gitit Overview
+
+Well, back to the content of this post.  [Gitit](http://gitit.net/) is another really nice program
+built around [pandoc](http://johnmacfarlane.net/pandoc) that we can leverage to work with our BibTeX
+files.  Gitit is a web server that provides a wiki where the pages and their history are stored in
+Git, Mercurial, or Darcs.  The pages are written in Markdown, and pandoc is used to convert the
+pages to HTML for display.  This should sound very similar to [the third
+post](2012-06-19-bibtex-and-pandoc-2.html).  Gitit also supports plugins, so the idea is to take the
+code from the third post and make it a gitit plugin.
+
+The advantages of gitit are:
+
+* it converts the marked-up BibTeX files to HTML on demand, so we don't need to run a tool,
+* it allows browsing of the marked-up BibTeX from any computer,
+* it allows editing and previewing from the web browser,
+* since the marked-up BibTeX files I write are sometimes mini surveys, browsing them from
+  the web is nice when I am trying to refresh my memory about theorems or open problems in some
+  area.  Also, math formulas are rendered nicely.
+
+The disadvantages are:
+
+* There is a fair amount of setup and configuration required.
+* There is a cheaper alternative: Mercurial [hooks](http://mercurial.selenic.com/wiki/Hook) can run the code from the
+  [third post](2012-06-19-bibtex-and-pandoc-2.html) on update and commit, or a shell script can be
+  written to run the code.  This provides almost all of the same advantages for much less setup cost.
+* The benefit is only worth the setup cost if the server is accessible on
+  the internet, and configuring and maintaining a server has a high overhead.
+
+Since I am already running a web server on a computer at home, the overhead for me isn't that
+much.  But even with gitit set up and running, most of the time I edit with Vim locally and push with
+Mercurial instead of going through the gitit web interface.  (One nice thing about gitit is that if you
+push commits over ssh, the web pages are automatically updated.)  If I weren't already running a server, I might
+not bother with gitit and instead work out something with Mercurial hooks or hakyll or something
+like that.
+
+## Gitit Install and Config
+
+Install gitit using the normal
+[instructions](http://gitit.net/README#compiling-and-installing-gitit).  Next, edit the gitit config
+file to switch the repository type to Mercurial, and perhaps change the default math option.  Then
+push the pages to the gitit repo.  I skipped setting up caching since there is hardly any traffic.  I
+also proxy gitit from Apache over SSL as described in the README.  I set HTTP authentication
+in the gitit config and then in the Apache config use `AuthType Basic` with a manually updated list
+of users and passwords.  At the moment, I allow browsing without authentication, but only
+authenticated users are allowed to edit.  This does mean anyone can browse my pages, but this is
+sometimes useful; for example, I can send links to collaborators.  But I still want these pages to be
+semi-private, so I ask search engines not to index the content using
+[robots.txt](http://en.wikipedia.org/wiki/Robots.txt).  You might consider instead just requiring
+logins to both view and edit content.
+
+## Plugin
+
+Once all the above configuration is done and working, gitit will serve and allow editing of pages.
+The only downside is that the raw BibTeX will be shown in the code blocks.  Luckily, gitit supports
+plugins and we have already written all the code!  All we need to do is take the code from [this
+post](2012-06-19-bibtex-and-pandoc-2.html), delete the `wOptions`, `processFile`, and `main`
+functions, import `Network.Gitit.Interface`, and add the following:
+
+~~~ {.haskell}
+plugin :: Plugin 
+plugin = mkPageTransform transformBlock 
+~~~
+
+For your convenience, you can download the resulting file [here](/code/BibtexGitit.hs).  Drop that file
+in your gitit plugins directory and add the plugin to the gitit config.
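
A rough mental model of why those two lines are enough, assuming `mkPageTransform` applies the given function to every matching node of the parsed page: the plugin only has to say how to rewrite one block, and the traversal is handled elsewhere. The sketch below models that with base Haskell only. The `Block` type, `toyTransform`, `everywhere`, and `samplePage` are simplified stand-ins invented for this illustration, not pandoc's or gitit's real definitions.

```haskell
-- Toy stand-ins for pandoc's document types (hypothetical, for illustration only).
data Block
  = Para String                -- a paragraph of plain text
  | CodeBlock [String] String  -- classes and the raw contents
  | BlockQuote [Block]         -- a container holding nested blocks
  deriving (Eq, Show)

-- Rewrite a single block: code blocks tagged "bib" become a placeholder
-- paragraph, standing in for the rendered bibliography.
toyTransform :: Block -> Block
toyTransform (CodeBlock classes contents)
  | "bib" `elem` classes = Para ("rendered: " ++ contents)
toyTransform other = other

-- Apply a block rewrite to every block of a page, children before parents.
everywhere :: (Block -> Block) -> [Block] -> [Block]
everywhere f = map go
  where
    go (BlockQuote inner) = f (BlockQuote (map go inner))
    go block              = f block

-- A sample page: one bib block nested in a quote, one ordinary code block.
samplePage :: [Block]
samplePage =
  [ Para "Notes on extremal graph theory"
  , BlockQuote [CodeBlock ["bib"] "@article{key, title={T}}"]
  , CodeBlock ["haskell"] "main = return ()"
  ]
```

Evaluating `everywhere toyTransform samplePage` in GHCi rewrites only the nested `bib` block and leaves the `haskell` code block untouched, which is the behaviour the real `transformBlock` relies on.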

File static/code/BibtexGitit.hs

+{-
+Copyright (C) 2012 John Lenz <lenz@math.uic.edu>
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without modification,
+are permitted provided that the following conditions are met:
+
+Redistributions of source code must retain the above copyright notice, this list
+of conditions and the following disclaimer.  Redistributions in binary form must
+reproduce the above copyright notice, this list of conditions and the following
+disclaimer in the documentation and/or other materials provided with the
+distribution.  Neither the name of John Lenz nor the names of its contributors
+may be used to endorse or promote products derived from this software without
+specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
+ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+-}
+
+module BibtexGitit where
+
+-- For documentation, see
+-- * http://blog.wuzzeb.org/posts/2012-06-19-bibtex-and-pandoc-2.html
+-- * http://blog.wuzzeb.org/posts/2012-06-26-bibtex-and-gitit.html
+
+import Control.Monad (liftM)
+import Data.Char (toLower, isAlpha)
+import qualified Data.List as L
+import qualified Data.List.Split as S
+import qualified Text.ParserCombinators.Parsec as P
+import Text.ParserCombinators.Parsec ((<|>))
+import Text.Pandoc
+
+import Network.Gitit.Interface
+
+-- | One BibTeX entry: its citation key and its (lower-cased field name, value) pairs.
+data Bibtex = Bibtex String [(String,String)]
+
+bibParser :: P.Parser [Bibtex]
+bibParser = do
+  x <- P.sepEndBy bibEntry P.spaces
+  P.eof
+  return x
+
+bibEntry :: P.Parser Bibtex
+bibEntry = do
+  P.char '@'
+  P.many $ P.noneOf "{"
+  P.char '{'
+  name <- P.many $ P.noneOf ","
+  P.many $ P.oneOf " \t\n,"
+  attrs <- P.sepEndBy bibAttr $ P.many1 $ P.oneOf " \t\n,"
+  P.char '}'
+  return $ Bibtex name attrs
+
+bibAttr :: P.Parser (String, String)
+bibAttr = do
+  key <- P.many (P.letter <|> P.digit)
+  P.spaces
+  P.char '='
+  P.spaces
+  P.char '{'
+  val <- bibVal
+  P.char '}'
+  return (map toLower key, val)
+
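+-- | The body of a field value: a non-empty run of plain characters and nested
+-- brace groups.  The braces of nested groups are dropped from the result.
+bibVal :: P.Parser String
+bibVal = liftM concat $ P.many1 (bibValMatched <|> (liftM (:[]) (P.noneOf "{}")))
+
+-- | A nested @{...}@ group inside a value; returns only the text inside the braces.
+bibValMatched :: P.Parser String
+bibValMatched = P.between (P.char '{') (P.char '}') bibVal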
+
+renderEntries :: [Bibtex] -> Block
+renderEntries lst = DefinitionList $ map display lst'
+    where lst' = L.sortBy (\(Bibtex a _) (Bibtex b _) -> compare a b) lst
+          display (Bibtex key b) = ([Strong [Str key]], [[Plain $ renderEntry key b]])
+
+type BibtexAttr = [(String, String)]
+
+render1 :: BibtexAttr -> String -> Inline
+render1 b s = case lookup s b of
+    Just x  -> Str x
+    Nothing -> Str ""
+
+articleLink :: String -> BibtexAttr -> Inline
+articleLink s b = case lookup s b of
+                  Just x  -> Link [Str "article"] (x, [])
+                  Nothing -> Str ""
+
+mrNumber :: BibtexAttr -> Inline
+mrNumber b = case lookup "mrnumber" b of
+                Just x -> mkURL x
+                Nothing -> Str ""
+  where -- Use the first word of the field, minus a leading "MR"; the pattern
+        -- guard avoids calling head on an all-whitespace field.
+        mkURL x | length x > 2, (w:_) <- words x =
+                    Link [Str "MathSciNet"] (mathSciNet ++ dropWhile isAlpha w, [])
+        mkURL _ = Str ""
+        mathSciNet = "http://www.ams.org/mathscinet/search/publdoc.html?pg1=MR&s1="
+
+arxiv :: BibtexAttr -> Inline
+arxiv b = case lookup "arxiv" b of
+                 Just x  -> mkURL x
+                 Nothing -> Str ""
+ where mkURL x = Link [Str "arXiv"] (url ++ dropWhile isAlpha x, [])
+       url = "http://arxiv.org/abs/"
+
+expandTex :: String -> String
+expandTex ('\\':a:'{':b:'}':xs) = expandTex ('\\':a:b:xs)
+expandTex ('\\':'\'':a:xs) = a' : expandTex xs
+   where a' = case a of
+               'a' -> 'á'
+               'e' -> 'é'
+               'o' -> 'ó'
+               _   -> a
+expandTex ('\\':'H':'o':xs) = 'ő' : expandTex xs
+expandTex ('\\':'"':a:xs) = a' : expandTex xs
+   where a' = case a of
+               'a' -> 'ä'
+               'e' -> 'ë'
+               'o' -> 'ö'
+               _   -> a
+expandTex (a:xs) = a : expandTex xs
+expandTex [] = []
+
+prettyAuthor :: String -> String
+prettyAuthor x = L.intercalate ", " $ map fixOne $ S.splitOn " and" x
+  where fixOne s = case S.splitOn "," s of
+                    []     -> ""
+                    [a]    -> a
+                    (f:xs) -> concat xs ++ " " ++ f
+
+renderEntry :: String -> BibtexAttr -> [Inline]
+renderEntry name b = raw ++ entries
+  where raw = [(RawInline "html" $ "<a name=\"" ++ name ++ "\"></a>")]
+
+        entries = L.intersperse (Str ", ") $ filter (not . isEmptyStr)
+            [ mapInline (prettyAuthor . expandTex) $ render1 b "author"
+            , mapInline (\a -> "\"" ++ a ++ "\"") $ render1 b "title"
+            , Emph [render1 b "journal"]
+            , render1 b "year"
+            , mrNumber b
+            , articleLink "url" b
+            , articleLink "url2" b
+            , arxiv b
+            ]
+
+        mapInline f (Str s) = Str $ f s
+        mapInline _ x = x
+        isEmptyStr (Str "") = True
+        isEmptyStr _        = False
+
+-- | Replace every code block tagged with the @bib@ class by its rendered
+-- bibliography; all other blocks pass through unchanged.
+transformBlock :: Block -> Block
+transformBlock (CodeBlock (_, classes, _) contents) | "bib" `elem` classes =
+        case P.parse bibParser "" contents of
+           Left err -> BlockQuote [Para [Str $ "Error parsing bib data: " ++ show err]]
+           Right x -> renderEntries x
+transformBlock x = x
+
+plugin :: Plugin
+plugin = mkPageTransform transformBlock
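
The `bibVal` / `bibValMatched` pair in this file accepts nested braces inside a field value and returns the text with the inner braces dropped. The following is a base-only sketch of the same idea without Parsec, meant only as a way to experiment with that behaviour. The name `braced` is made up for this illustration, and unlike the Parsec version (which uses `many1`) it also accepts empty groups such as `{}`.

```haskell
-- Read one brace-delimited value from the front of the input.
-- Returns the value with nested braces removed, plus the unconsumed rest,
-- or Nothing when the input does not start with '{' or a brace is unclosed.
braced :: String -> Maybe (String, String)
braced ('{' : rest) = go rest
  where
    -- 'go' consumes the body of a group up to and including its closing brace.
    go ('}' : xs) = Just ("", xs)
    go ('{' : xs) = do
      (inner, afterInner) <- go xs          -- a nested group
      (more, afterAll)    <- go afterInner  -- remainder of the enclosing group
      return (inner ++ more, afterAll)
    go (c : xs) = do
      (more, afterAll) <- go xs
      return (c : more, afterAll)
    go [] = Nothing                         -- ran out of input: unbalanced
braced _ = Nothing
```

For example, `braced "{The {E}rd{\\H{o}}s problem}, year"` yields `Just ("The Erd\\Hos problem", ", year")`. The `\Ho` left behind is the form that the `expandTex` clause matching `\Ho` turns into `ő`.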