Commits

Patrick Mézard committed ab48188 Merge

Merge with r506

  • Parent commits 3429d86, 616da1b

Files changed (8)

File cpp/CMakeLists.txt

   "src/phonenumbers/geocoding/area_code_map.cc"
   "src/phonenumbers/geocoding/default_map_storage.cc"
   "src/phonenumbers/geocoding/geocoding_data.cc"
+  "src/phonenumbers/geocoding/mapping_file_provider.cc"
   "src/phonenumbers/logger.cc"
   "src/phonenumbers/metadata.h"          # Generated by build tools.
   "src/phonenumbers/phonemetadata.pb.cc" # Generated by Protocol Buffers.
   "test/phonenumbers/geocoding/area_code_map_test.cc"
   "test/phonenumbers/geocoding/geocoding_data_test.cc"
   "test/phonenumbers/geocoding/geocoding_test_data.cc"
+  "test/phonenumbers/geocoding/mapping_file_provider_test.cc"
   "test/phonenumbers/logger_test.cc"
   "test/phonenumbers/phonenumberutil_test.cc"
   "test/phonenumbers/regexp_adapter_test.cc"
     You can install it very easily on a Debian-based GNU/Linux distribution:
     $ sudo apt-get install cmake
 
+    Additionally it is recommended you install the ccmake configuration tool:
+    $ sudo apt-get install cmake-curses-gui
+
   - Protocol Buffers
     http://code.google.com/p/protobuf/
     Version 2.4 or more recent is required.
     $ make testinstall
 
   - ICU
-    Version 4.4 or more recent is required.
-    It can be built from sources. You need to download the source tarball at
-    this location:
-    http://site.icu-project.org/download
-    Then you can extract, build and install it this way:
-    $ tar xzf icu4c-4_4_2-src.tgz
+    Version 4.4 or more recent is required. It can be installed easily on Debian
+    systems or be built from the most recent sources (currently 49.1.2).
+
+    If you have a Debian-based distribution you can check which version of the
+    ICU libraries is available by doing:
+    $ apt-cache show libicu-dev
+    And looking for the "Version:" string.
+
+    If this is 4.4 or above then you can just do:
+    $ sudo apt-get install libicu-dev
+
+    Otherwise you need to download the source tarball for the latest version
+    from:
+      http://site.icu-project.org/download
+    And then extract it via:
+    $ tar xzf icu4c-49_1_2-src.tgz
+
+    Alternatively you can export the SVN repository to the current directory
+    via:
+    $ svn export http://source.icu-project.org/repos/icu/icu/tags/release-49-1-2/
+
+    Having acquired the latest sources, make and install it via:
     $ cd icu/source
     $ ./configure && make && sudo make install
 
-    If you have a Debian-based distribution providing an up-to-date version of
-    ICU, you can install it using apt-get:
-    $ sudo apt-get install libicu-dev
-
   - Boost
     Version 1.40 or more recent is required.
 
     Note: Boost Thread is the only library needed at link time.
 
 How to build libphonenumber C++:
-  $ cd libphonenumber
+  $ cd libphonenumber/cpp
   $ mkdir build
   $ cd build
   $ cmake ..
   $ make
 
+Troubleshooting CMake via ccmake:
+  Follow these instructions if the build steps above don't work for you.
+
+  - Incorrect protocol buffer library issues
+    If the build process complains that the version of protoc being used is too
+    old or that it cannot find the correct libprotobuf library, you may need to
+    change the library path of the project.
+
+    This issue should typically only occur in cases where you have two (or more)
+    versions of the protocol buffer libraries installed on your system. This
+    step assumes that you have already manually downloaded and installed the
+    protocol buffer libraries into /usr/local (as described above).
+
+    To make cmake use the manually installed version of the protocol buffer
+    libraries, install cmake-curses-gui and use ccmake as follows.
+
+    From within libphonenumber/cpp/build:
+    $ ccmake .
+
+    You should set the following values:
+      PROTOBUF_INCLUDE_DIR         /usr/local/include
+      PROTOBUF_LIB                 /usr/local/lib/libprotobuf.so
+      PROTOC_BIN                   /usr/local/bin/protoc
+
+    Now press 'c' then 'g' to configure the new parameters and exit ccmake.
+    Finally regenerate the make files and rebuild via:
+    $ cmake ..
+    $ make
+
+  - Protoc binary not executing properly
+    If you still have issues with the protoc binary tool in /usr/local/bin not
+    running correctly (cannot find libprotobuf.so.x) then you may need to add
+    /usr/local/lib to the dynamic linker search path. To do this, as a
+    superuser, add the following file:
+      /etc/ld.so.conf.d/libprotobuf.conf
+
+    with the contents:
+      # Use the manually installed version of the protocol buffer libraries.
+      /usr/local/lib
+
+    And then run:
+      $ sudo chmod 644 /etc/ld.so.conf.d/libprotobuf.conf
+      $ sudo ldconfig
+
+  - Incorrect ICU library issues
+    Similar to the protocol buffer library issue above, it is possible that your
+    build may fail if you have two conflicting versions of the ICU libraries
+    installed on your system. This step assumes that you have already manually
+    downloaded and installed a recent version of the ICU libraries into
+    /usr/local.
+
+    Install and run the ccmake tool (as described above) and set the following
+    values:
+      ICU_I18N_INCLUDE_DIR         /usr/local/include
+      ICU_I18N_LIB                 /usr/local/lib/libicui18n.so
+      ICU_UC_INCLUDE_DIR           /usr/local/include
+      ICU_UC_LIB                   /usr/local/lib/libicuuc.so
+
+    Now press 'c' then 'g' to configure the new parameters and exit ccmake.
+    Finally regenerate the make files and rebuild via:
+    $ cmake ..
+    $ make
 
 Building the library on Windows (Visual Studio)
 -----------------------------------------------

File cpp/src/phonenumbers/geocoding/mapping_file_provider.cc

+// Copyright (C) 2012 The Libphonenumber Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+//
+// Author: Patrick Mezard
+
+#include "phonenumbers/geocoding/mapping_file_provider.h"
+
+#include <algorithm>
+#include <cstddef>
+#include <cstring>
+#include <sstream>
+#include <string>
+
+#include "phonenumbers/geocoding/geocoding_data.h"
+
+namespace i18n {
+namespace phonenumbers {
+
+using std::string;
+
+namespace {
+
+struct NormalizedLocale {
+  const char* locale;
+  const char* normalized_locale;
+};
+
+const NormalizedLocale kNormalizedLocales[] = {
+  {"zh_TW", "zh_Hant"},
+  {"zh_HK", "zh_Hant"},
+  {"zh_MO", "zh_Hant"},
+};
+
+const char* GetNormalizedLocale(const string& full_locale) {
+  const int size = sizeof(kNormalizedLocales) / sizeof(*kNormalizedLocales);
+  for (int i = 0; i != size; ++i) {
+    if (full_locale.compare(kNormalizedLocales[i].locale) == 0) {
+      return kNormalizedLocales[i].normalized_locale;
+    }
+  }
+  return NULL;
+}
+
+void AppendLocalePart(const string& part, string* full_locale) {
+  if (!part.empty()) {
+    full_locale->append("_");
+    full_locale->append(part);
+  }
+}
+
+void ConstructFullLocale(const string& language, const string& script,
+                         const string& region, string* full_locale) {
+  full_locale->assign(language);
+  AppendLocalePart(script, full_locale);
+  AppendLocalePart(region, full_locale);
+}
+
+// Returns true if s1 comes strictly before s2 in lexicographic order.
+bool IsLowerThan(const char* s1, const char* s2) {
+  return strcmp(s1, s2) < 0;
+}
+
+// Returns true if languages contains language.
+bool HasLanguage(const CountryLanguages* languages, const string& language) {
+  const char** const start = languages->available_languages;
+  const char** const end = start + languages->available_languages_size;
+  const char** const it =
+      std::lower_bound(start, end, language.c_str(), IsLowerThan);
+  return it != end && strcmp(language.c_str(), *it) == 0;
+}
+
+}  // namespace
+
+MappingFileProvider::MappingFileProvider(
+    const int* country_calling_codes, int country_calling_codes_size,
+    country_languages_getter get_country_languages)
+  : country_calling_codes_(country_calling_codes),
+    country_calling_codes_size_(country_calling_codes_size),
+    get_country_languages_(get_country_languages) {
+}
+
+const string& MappingFileProvider::GetFileName(int country_calling_code,
+                                               const string& language,
+                                               const string& script,
+                                               const string& region,
+                                               string* filename) const {
+  filename->clear();
+  if (language.empty()) {
+    return *filename;
+  }
+  const int* const country_calling_codes_end = country_calling_codes_ +
+      country_calling_codes_size_;
+  const int* const it =
+      std::lower_bound(country_calling_codes_,
+                       country_calling_codes_end,
+                       country_calling_code);
+  if (it == country_calling_codes_end || *it != country_calling_code) {
+    return *filename;
+  }
+  const CountryLanguages* const langs =
+      get_country_languages_(it - country_calling_codes_);
+  if (langs->available_languages_size > 0) {
+    string language_code;
+    FindBestMatchingLanguageCode(langs, language, script, region,
+                                 &language_code);
+    if (!language_code.empty()) {
+      std::stringstream filename_buf;
+      filename_buf << country_calling_code << "_" << language_code;
+      *filename = filename_buf.str();
+    }
+  }
+  return *filename;
+}
+
+void MappingFileProvider::FindBestMatchingLanguageCode(
+    const CountryLanguages* languages, const string& language,
+    const string& script, const string& region, string* best_match) const {
+  string full_locale;
+  ConstructFullLocale(language, script, region, &full_locale);
+  const char* const normalized_locale = GetNormalizedLocale(full_locale);
+  if (normalized_locale != NULL) {
+    string normalized_locale_str(normalized_locale);
+    if (HasLanguage(languages, normalized_locale_str)) {
+      best_match->swap(normalized_locale_str);
+      return;
+    }
+  }
+
+  if (HasLanguage(languages, full_locale)) {
+    best_match->swap(full_locale);
+    return;
+  }
+
+  if (script.empty() != region.empty()) {
+    if (HasLanguage(languages, language)) {
+      *best_match = language;
+      return;
+    }
+  } else if (!script.empty() && !region.empty()) {
+    string lang_with_script(language);
+    lang_with_script.append("_");
+    lang_with_script.append(script);
+    if (HasLanguage(languages, lang_with_script)) {
+      best_match->swap(lang_with_script);
+      return;
+    }
+  }
+
+  string lang_with_region(language);
+  lang_with_region.append("_");
+  lang_with_region.append(region);
+  if (HasLanguage(languages, lang_with_region)) {
+    best_match->swap(lang_with_region);
+    return;
+  }
+  if (HasLanguage(languages, language)) {
+    *best_match = language;
+    return;
+  }
+  best_match->clear();
+}
+
+}  // namespace phonenumbers
+}  // namespace i18n
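
Note on the fallback order in FindBestMatchingLanguageCode: the sketch below is a simplified, standalone illustration of the order in which candidate language codes are tried. It is not the committed code. The names Contains, BestMatch and DemoLanguages are hypothetical, a sorted std::vector<std::string> stands in for the CountryLanguages table, and the locale normalization step (zh_TW, zh_HK, zh_MO to zh_Hant) is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true if langs (sorted ascending) contains code.
bool Contains(const std::vector<std::string>& langs, const std::string& code) {
  return std::binary_search(langs.begin(), langs.end(), code);
}

// Hypothetical stand-in mirroring the order of checks in the diff above.
// Returns the first candidate found in langs, or an empty string.
std::string BestMatch(const std::vector<std::string>& langs,
                      const std::string& language,
                      const std::string& script,
                      const std::string& region) {
  // 1. Full locale: language[_script][_region].
  std::string full = language;
  if (!script.empty()) full += "_" + script;
  if (!region.empty()) full += "_" + region;
  if (Contains(langs, full)) return full;

  if (script.empty() != region.empty()) {
    // 2a. Exactly one of script/region was given: try the bare language.
    if (Contains(langs, language)) return language;
  } else if (!script.empty() && !region.empty()) {
    // 2b. Both were given: try language_script.
    const std::string with_script = language + "_" + script;
    if (Contains(langs, with_script)) return with_script;
  }

  // 3. Try language_region, then the bare language.
  const std::string with_region = language + "_" + region;
  if (Contains(langs, with_region)) return with_region;
  if (Contains(langs, language)) return language;
  return "";
}

// Hypothetical sample data, already in sorted order.
std::vector<std::string> DemoLanguages() {
  std::vector<std::string> langs;
  langs.push_back("en");
  langs.push_back("zh");
  langs.push_back("zh_Hant");
  return langs;
}
```

With this sample data, a request for ("zh", "Hant", "TW") has no exact match and resolves to "zh_Hant", while ("en", "", "US") resolves to "en".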

File cpp/src/phonenumbers/geocoding/mapping_file_provider.h

+// Copyright (C) 2012 The Libphonenumber Authors
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Author: Patrick Mezard
+
+#ifndef I18N_PHONENUMBERS_GEOCODING_MAPPING_FILE_PROVIDER_H_
+#define I18N_PHONENUMBERS_GEOCODING_MAPPING_FILE_PROVIDER_H_
+
+#include <string>
+
+#include "base/basictypes.h"
+
+namespace i18n {
+namespace phonenumbers {
+
+using std::string;
+
+struct CountryLanguages;
+
+// A utility which knows the data files that are available for the geocoder to
+// use. The data files contain mappings from phone number prefixes to text
+// descriptions, and are organized by country calling code and language that the
+// text descriptions are in.
+class MappingFileProvider {
+ public:
+  typedef const CountryLanguages* (*country_languages_getter)(int index);
+
+  // Initializes a MappingFileProvider with country_calling_codes, a sorted
+  // list of country_calling_codes_size calling codes, and a function
+  // get_country_languages(int index) returning the CountryLanguages
+  // information related to the country code at index in country_calling_codes.
+  MappingFileProvider(const int* country_calling_codes,
+                      int country_calling_codes_size,
+                      country_languages_getter get_country_languages);
+
+  // Returns the name of the file that contains the mapping data for the
+  // country_calling_code in the language specified, or an empty string if no
+  // such file can be found. language is a two-letter lowercase ISO language
+  // code as defined by ISO 639-1. script is a four-letter titlecase (the first
+  // letter is uppercase and the rest of the letters are lowercase) ISO script
+  // code as defined in ISO 15924. region is a two-letter uppercase ISO country
+  // code as defined by ISO 3166-1.
+  const string& GetFileName(int country_calling_code, const string& language,
+                            const string& script, const string& region,
+                            string* filename) const;
+
+ private:
+  void FindBestMatchingLanguageCode(const CountryLanguages* languages,
+                                    const string& language,
+                                    const string& script,
+                                    const string& region,
+                                    string* best_match) const;
+
+  const int* const country_calling_codes_;
+  const int country_calling_codes_size_;
+  const country_languages_getter get_country_languages_;
+
+  DISALLOW_COPY_AND_ASSIGN(MappingFileProvider);
+};
+
+}  // namespace phonenumbers
+}  // namespace i18n
+
+#endif  // I18N_PHONENUMBERS_GEOCODING_MAPPING_FILE_PROVIDER_H_
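
Note on the constructor inputs: the header expects a sorted array of calling codes plus a getter that returns the language table for a given index into that array. The standalone sketch below shows how those pieces could fit together and how a file name of the form "<calling code>_<language>" is produced. It does not use the real class. DemoCountryLanguages, the kDemo* tables, GetDemoCountryLanguages and DemoFileName are hypothetical, and only an exact language match is attempted (no script/region fallback).

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical stand-in for CountryLanguages; only the two fields read by
// mapping_file_provider.cc are modeled.
struct DemoCountryLanguages {
  const char** available_languages;  // Sorted in strcmp order.
  int available_languages_size;
};

// Hypothetical data: calling codes sorted ascending, one table per code.
const int kDemoCallingCodes[] = {1, 33, 86};
const char* kDemoLangs1[] = {"en"};
const char* kDemoLangs33[] = {"en", "fr"};
const char* kDemoLangs86[] = {"en", "zh", "zh_Hant"};
const DemoCountryLanguages kDemoTables[] = {
  {kDemoLangs1, 1}, {kDemoLangs33, 2}, {kDemoLangs86, 3},
};

// Same shape as the country_languages_getter typedef: index in, table out.
const DemoCountryLanguages* GetDemoCountryLanguages(int index) {
  return &kDemoTables[index];
}

// Returns true if s1 comes strictly before s2 in lexicographic order.
bool IsLowerThan(const char* s1, const char* s2) {
  return std::strcmp(s1, s2) < 0;
}

// Binary-searches the calling code, then the language, and builds the name.
// Returns an empty string if either lookup fails.
std::string DemoFileName(int country_calling_code,
                         const std::string& language) {
  const int* const begin = kDemoCallingCodes;
  const int* const end =
      begin + sizeof(kDemoCallingCodes) / sizeof(*kDemoCallingCodes);
  const int* const it = std::lower_bound(begin, end, country_calling_code);
  if (it == end || *it != country_calling_code) return "";

  const DemoCountryLanguages* const langs =
      GetDemoCountryLanguages(static_cast<int>(it - begin));
  const char** const first = langs->available_languages;
  const char** const last = first + langs->available_languages_size;
  const char** const pos =
      std::lower_bound(first, last, language.c_str(), IsLowerThan);
  if (pos == last || std::strcmp(*pos, language.c_str()) != 0) return "";

  std::ostringstream name;
  name << country_calling_code << "_" << language;
  return name.str();
}
```

Both lookups rely on the inputs being sorted, which is why the header comment requires a sorted list of calling codes; an unsorted table would make std::lower_bound return unspecified positions.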