More verbose, one dependency, but guarantees consistent output for most inputs and was fun to write:
import re

def format_tel(tel):
    tel = tel.removeprefix("+")
    tel = tel.removeprefix("1")  # remove leading +1 or 1
    tel = re.sub("[ ()-]", '', tel)  # remove space, (), -
    assert(len(tel) == 10)
    tel = f"{tel[:3]}-{tel[3:6]}-{tel[6:]}"
    return tel
Output:
>>> format_tel("1-800-628-8737")
'800-628-8737'
>>> format_tel("800-628-8737")
'800-628-8737'
>>> format_tel("18006288737")
'800-628-8737'
>>> format_tel("1800-628-8737")
'800-628-8737'
>>> format_tel("(800) 628-8737")
'800-628-8737'
>>> format_tel("(800) 6288737")
'800-628-8737'
>>> format_tel("(800)6288737")
'800-628-8737'
>>> format_tel("8006288737")
'800-628-8737'
Without magic numbers; …if you’re not into the whole brevity thing:
def format_tel(tel):
    AREA_BOUNDARY = 3  # 800.6288737
    SUBSCRIBER_SPLIT = 6  # 800628.8737
    tel = tel.removeprefix("+")
    tel = tel.removeprefix("1")  # remove leading +1, or 1
    tel = re.sub("[ ()-]", '', tel)  # remove space, (), -
    assert(len(tel) == 10)
    tel = (f"{tel[:AREA_BOUNDARY]}-"
           f"{tel[AREA_BOUNDARY:SUBSCRIBER_SPLIT]}-{tel[SUBSCRIBER_SPLIT:]}")
    return tel
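Since assert statements are skipped when Python runs with -O, a variant that raises ValueError may be safer for checking input. The following is only a sketch: format_tel_strict is a hypothetical name, and unlike the functions above it strips every non-digit character, not just spaces, parentheses and hyphens.

```python
import re


def format_tel_strict(tel):
    """Return a US number as NNN-NNN-NNNN, or raise ValueError."""
    digits = re.sub(r"\D", "", tel)  # drop everything that is not a digit
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code 1
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)} in {tel!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


print(format_tel_strict("+1 (800) 628-8737"))  # 800-628-8737
```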
Introduction
Validating phone numbers can be a very challenging task. The format of a phone number can vary from one country to another. Heck, it can also vary within the same country! Some countries share the same country code, while others use more than one country code. According to an example from Google’s libphonenumber GitHub repository, the USA, Canada, and the Caribbean islands all share the same country code (+1). On the other hand, phone numbers in Kosovo can be reached through the country codes of Serbia, Slovenia, and Monaco.
These are only a few of the challenges in identifying or validating phone numbers. At first glance, one can at least validate the country code of a phone number with a RegEx. However, this means that you would have to write a custom RegEx rule for every country in the world, just to validate a country code. On top of that, some mobile phone carriers have their own rules (for example, certain digits can only use a certain range of numbers). You can see that things can quickly get out of hand and make it almost impossible for us to validate phone number inputs by ourselves.
Luckily, there is a Python library that can help us get through the validation process easily and efficiently. The Python Phonenumbers library is derived from Google’s libphonenumber library, which is also available for other programming languages like C++, Java, and JavaScript.
In this tutorial, we’ll learn how to parse, validate and extract phone numbers, as well as how to extract additional information from the phone number(s) like the carrier, timezone, or geocoder details.
Using the library is very straightforward, and it’s typically used like this:
import phonenumbers
from phonenumbers import carrier, timezone, geocoder
my_number = phonenumbers.parse("+447986123456", "GB")
print(phonenumbers.is_valid_number(my_number))
print(carrier.name_for_number(my_number, "en"))
print(timezone.time_zones_for_number(my_number))
print(geocoder.description_for_number(my_number, 'en'))
And here’s the output:
True
EE
('Europe/Guernsey', 'Europe/Isle_of_Man', 'Europe/Jersey', 'Europe/London')
United Kingdom
Let’s get started by setting up our environment and installing the library.
Installing phonenumbers
First, let’s create and activate our virtual environment:
$ mkdir phonenumbers && cd phonenumbers
$ python3 -m venv venv
$ . venv/bin/activate # venv\Scripts\activate.bat on Windows
Then we install the Python Phonenumbers library:
$ pip3 install phonenumbers
This tutorial uses version 8.12.19 of the Phonenumbers library.
Now we are ready to start discovering the Phonenumbers library.
Parse Phone Numbers with Python phonenumbers
Whether you get user input from a web form or from other sources, like extracting it from some text (more on that later in this tutorial), the input phone number will most likely be a string. As a first step, we’ll need to parse it using phonenumbers and turn it into a PhoneNumber instance so that we can use it for validation and other functionality.
We can parse the phone number using the parse() function:
import phonenumbers
my_string_number = "+40721234567"
my_number = phonenumbers.parse(my_string_number)
The phonenumbers.parse() function takes a phone number string as a required argument. You can also pass the country information in ISO Alpha-2 format as an optional argument. Consider, for example, the following code:
my_number = phonenumbers.parse(my_string_number, "RO")
"RO" stands for Romania in ISO Alpha-2 format. You can check other Alpha-2 and numeric country codes from this website. In this tutorial, for simplicity, I will omit the ISO Alpha-2 country code in most cases and include it only when it’s strictly necessary.
The phonenumbers.parse() function already has some basic built-in validation rules, such as checking the length of the number string, or checking for a leading zero or a + sign. Note that it raises an exception when any of these rules is not fulfilled, so remember to wrap the call in a try/except block in your application.
Now that we got our phone number parsed correctly, let’s proceed to validation.
Validate Phone Numbers with Python Phonenumbers
Phonenumbers has two methods to check the validity of a phone number. The main difference between these methods is the speed and accuracy.
To elaborate, let’s start with is_possible_number():
import phonenumbers
my_string_number = "+40021234567"
my_number = phonenumbers.parse(my_string_number)
print(phonenumbers.is_possible_number(my_number))
And the output would be:
True
Now let’s use the same number, but with the is_valid_number() function this time:
import phonenumbers
my_string_number = "+40021234567"
my_number = phonenumbers.parse(my_string_number)
print(phonenumbers.is_valid_number(my_number))
Even though the input was the same, the result would be different:
False
The reason is that is_possible_number() makes a quick guess at the phone number’s validity by checking only the length of the parsed number, while is_valid_number() runs a full validation by checking the length, the phone number prefix, and the region.
When iterating over a large list of phone numbers, phonenumbers.is_possible_number() provides faster results than phonenumbers.is_valid_number(). But as we see here, these results may not always be accurate. It can be useful for quickly eliminating phone numbers that do not have a plausible length, but a number that passes this check is not necessarily valid.
Extract and Format Phone Numbers with Python Phonenumbers
User input is not the only way to get or collect phone numbers. For instance, you may have a spider/crawler that reads certain pages from a website or a document and extracts the phone numbers from the text blocks. It sounds like a challenging problem, but luckily the Phonenumbers library provides just the functionality we need with the PhoneNumberMatcher(text, region) class.
PhoneNumberMatcher takes a text block and a region as arguments; iterating over it yields the matches as PhoneNumberMatch objects.
Let’s use PhoneNumberMatcher with some sample text:
import phonenumbers
text_block = "Our services will cost about 2,200 USD and we will deliver the product by the 10.10.2021. For more information, you can call us at +44 7986 123456 or send an e-mail to [email protected]"
for match in phonenumbers.PhoneNumberMatcher(text_block, "GB"):
print(match)
This will print the matching phone numbers along with their index in the string:
PhoneNumberMatch [131,146) +44 7986 123456
You may have noticed that our number is formatted in the standardized international format and divided by the spaces. This may not always be the case in real-life scenarios. You may receive your number in other formats, like divided by dashes or formatted to the national (instead of the international) format.
Let’s put PhoneNumberMatcher to the test with other phone number formats:
import phonenumbers
text_block = "Our services will cost about 2,200 USD and we will deliver the product by the 10.10.2021. For more information you can call us at +44-7986-123456 or 020 8366 1177 send an e-mail to [email protected]"
for match in phonenumbers.PhoneNumberMatcher(text_block, "GB"):
print(match)
This would output:
PhoneNumberMatch [130,145) +44-7986-123456
PhoneNumberMatch [149,162) 020 8366 1177
Even though the phone numbers are embedded deep in the text, in a variety of formats and alongside other numbers, PhoneNumberMatcher correctly picks out both of them.
Apart from extracting data from text, we might also want to get the digits one by one from the user. Imagine that your app’s UI works like a modern mobile phone and formats the phone number as you type. For instance, on your web page, you might want to pass the data to your API with each onkeyup event and use AsYouTypeFormatter() to format the phone number with each incoming digit.
Since the UI part is out of the scope of this article, we’ll use a basic example for AsYouTypeFormatter. To simulate on-the-fly formatting, let’s jump into the Python interpreter:
>>> import phonenumbers
>>> formatter = phonenumbers.AsYouTypeFormatter("TR")
>>> formatter.input_digit("3")
'3'
>>> formatter.input_digit("9")
'39'
>>> formatter.input_digit("2")
'392'
>>> formatter.input_digit("2")
'392 2'
>>> formatter.input_digit("2")
'392 22'
>>> formatter.input_digit("1")
'392 221'
>>> formatter.input_digit("2")
'392 221 2'
>>> formatter.input_digit("3")
'392 221 23'
>>> formatter.input_digit("4")
'392 221 23 4'
>>> formatter.input_digit("5")
'392 221 23 45'
Not all user input happens as they type. Some forms have simple text input fields for phone numbers. However, that doesn’t necessarily mean that we’ll have data entered in a standard format.
The Phonenumbers library has us covered here too, with the format_number() function. It lets us format phone numbers into three well-known, standardized formats: National, International, and E164. The National and International formats are pretty self-explanatory, while E164 is an international phone number format that limits phone numbers to 15 digits and formats them as {+}{country code}{number with area code}. For more information on E164, you can check this Wikipedia page.
Let’s start with the national formatting:
import phonenumbers
my_number = phonenumbers.parse("+40721234567")
national_f = phonenumbers.format_number(my_number, phonenumbers.PhoneNumberFormat.NATIONAL)
print(national_f)
This will return a nicely spaced phone number string with the national format:
0721 234 567
Now let’s try to format a national number in the international format:
import phonenumbers
my_number = phonenumbers.parse("0721234567", "RO") # "RO" is ISO Alpha-2 code for Romania
international_f = phonenumbers.format_number(my_number, phonenumbers.PhoneNumberFormat.INTERNATIONAL)
print(international_f)
The above code will return a nicely spaced phone number string:
+40 721 234 567
Notice that we passed "RO" as the second parameter to parse(). Since the input number is a national number, it has no country code prefix to hint at the country. In these cases, we need to specify the country with its ISO Alpha-2 code to get an accurate result. If the number has neither a numeric country code nor an ISO Alpha-2 region, parsing fails with NumberParseException: (0) Missing or invalid default region.
Now let’s try the E164 formatting option. We’ll pass a national string as the input:
import phonenumbers
my_number = phonenumbers.parse("0721234567", "RO")
e164_f = phonenumbers.format_number(my_number, phonenumbers.PhoneNumberFormat.E164)
print(e164_f)
The output will be very similar to that of PhoneNumberFormat.INTERNATIONAL, except without the spaces:
+40721234567
This is very useful when you want to pass the number to a background API. It isn’t uncommon for APIs to expect phone numbers to be non-spaced strings.
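Before handing a string to such an API, a quick shape check can catch values that were never normalized. The sketch below uses only the standard library and tests the E164 shape described earlier (a + sign, then at most 15 digits, not starting with 0); it does not check that the number actually exists. The names E164_SHAPE and looks_like_e164 are hypothetical.

```python
import re

# "+", a non-zero first digit, then up to 14 more digits (15 digits at most).
E164_SHAPE = re.compile(r"\+[1-9]\d{1,14}")


def looks_like_e164(value):
    """Return True if value has the E164 shape; this is not a validity check."""
    return E164_SHAPE.fullmatch(value) is not None


print(looks_like_e164("+40721234567"))     # True
print(looks_like_e164("+40 721 234 567"))  # False
```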
Get Additional Information on Phone Number
A phone number is loaded with data about a user that could be of interest to you. You may want to use different APIs or API endpoints depending on the carrier of the particular phone number since this plays a role in the product cost. You might want to send your promotion notifications depending on your customer’s (phone number’s) timezone so that you don’t send them a message in the middle of the night. Or you might want to get information about the phone number’s location so that you can provide relevant information. The Phonenumbers library provides the necessary tools to fulfill these needs.
To start with the location, we will use the description_for_number() function from the geocoder module. It takes a parsed phone number and a short language name as parameters.
Let’s try this with our earlier fake number:
import phonenumbers
from phonenumbers import geocoder
my_number = phonenumbers.parse("+447986123456")
print(geocoder.description_for_number(my_number, "en"))
This will print out the origin country of the phone number:
United Kingdom
Short language names are pretty intuitive. Let’s try to get output in Russian:
import phonenumbers
from phonenumbers import geocoder
my_number = phonenumbers.parse("+447986123456")
print(geocoder.description_for_number(my_number, "ru"))
And here’s the output which says the United Kingdom in Russian:
Соединенное Королевство
You can try it out with other languages of your preference, like "de", "fr", "zh", etc.
As mentioned before, you might want to group your phone numbers by their carriers, since in most cases the carrier has an impact on the cost. To be clear, the Phonenumbers library will likely report most carrier names accurately, but not all of them.
Today in most countries it is possible to get your number from one carrier and later on move the same number to a different carrier, leaving the phone number exactly the same. Since Phonenumbers is merely an offline Python library, it is not possible to detect these changes. So it’s best to approach the carrier names as a reference, rather than a fact.
We will use the name_for_number() function from the carrier module:
import phonenumbers
from phonenumbers import carrier
my_number = phonenumbers.parse("+40721234567")
print(carrier.name_for_number(my_number, "en"))
This will display the original carrier of the phone number if possible:
Vodafone
Note: As the Python Phonenumbers documentation mentions, carrier information is available for mobile numbers in some countries, not all.
Another important piece of information about a phone number is its timezone. The time_zones_for_number() function returns the timezones that the number belongs to. We’ll import it from phonenumbers.timezone:
import phonenumbers
from phonenumbers import timezone
my_number = phonenumbers.parse("+447986123456")
print(timezone.time_zones_for_number(my_number))
This will print the following timezones:
('Europe/Guernsey', 'Europe/Isle_of_Man', 'Europe/Jersey', 'Europe/London')
This concludes our tutorial on Python Phonenumbers.
Conclusion
We learned how to parse phone numbers with parse(), extract numbers from text blocks with PhoneNumberMatcher(), take phone numbers digit by digit and format them with AsYouTypeFormatter(), use different validation methods with is_possible_number() and is_valid_number(), format numbers using the NATIONAL, INTERNATIONAL, and E164 formats, and extract additional information from phone numbers using the geocoder, carrier, and timezone modules.
Remember to check out the original GitHub repo of the Phonenumbers library. Also if you have any questions in mind, feel free to comment below.
Formatting a phone number
17.11.2019, 12:56. Views 4251. Replies 2
Here is the task:
"Hi, as you know, a number can be written in different ways, e.g. 8915 12 — 34 — 567 like this; for example, the number + 7(915)123 — 45-67 can be written, and also + 7978123456779787654321, where 79787654321 is also a number, but some klutz wrote them run together, and now they have to be separated… and the system needs to be taught to separate them. 12345 is just digits and should not be counted; you need a number that definitely has an 8 or a 7, then the code, then the 7 digits of the number."
Extract all the phone numbers from the string.
0_o
Added 16 minutes later
Like this:
['89151234567', '79151234567', '7978123456779787654321', '79787654321', '12345', '8', '7', '7']
Added 13 minutes later
Now they need to be sorted, and the "long" ones split.
Added 7 minutes later
In short, it worked =))
Added 4 minutes later
Can this be shortened somehow?
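One possible shorter approach, sketched here under the assumptions stated in the task (a number is an 8 or a 7 followed by ten more digits, and separators are spaces, parentheses, plus signs, hyphens and dashes): remove separators that sit between digits, then cut each digit run into 11-digit numbers. The names SEPARATORS and extract_numbers are hypothetical, and the sketch can mis-split when stray digits sit directly next to a number.

```python
import re

# Separator characters that sit between two digits (\u2013 and \u2014 are dashes).
SEPARATORS = re.compile(r"(?<=\d)[\s()+\-\u2013\u2014]+(?=\d)")


def extract_numbers(text):
    """Return every 11-digit number that starts with 7 or 8."""
    compact = SEPARATORS.sub("", text)  # glue the pieces of each number together
    numbers = []
    for run in re.findall(r"\d+", compact):
        # A run may hold several numbers written back to back.
        numbers.extend(re.findall(r"[78]\d{10}", run))
    return numbers


sample = ("8915 12 \u2014 34 \u2014 567 like this, + 7(915)123 \u2014 45-67 "
          "or + 7978123456779787654321 and 12345")
print(extract_numbers(sample))
```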
Coding marathon. Task 6.
In the database of a certain company, phone numbers are stored haphazardly, and you have been asked to write a function that converts them all to a single format.
The function must be named format_numbers; it takes a string (the original number) as input and returns a string (the number in the required format).
Signature: def format_numbers(phone_number: str) -> str:
Properties of the numbers in the database:
- besides digits, a number may contain hyphens (-), spaces, parentheses, and a + sign (but + only as the first character)
- the number is always valid (contains 11 digits)
- the number always starts with 8 or +7, and + can appear only at the start of the number
- there cannot be 2 consecutive hyphens, parentheses, or spaces
The output for any number must have this format: +7(909)101-10-10
Examples
format_numbers('+79091011010') == '+7(909)101-10-10'
format_numbers('8(909)1011010') == '+7(909)101-10-10'
format_numbers('+7 909 101-10-10') == '+7(909)101-10-10'
Possible solutions
def format_numbers(phone_number: str) -> str:
    return '+7({0}{1}{2}){3}{4}{5}-{6}{7}-{8}{9}'.format(*[i for i in phone_number if i.isdigit()][1:])

def format_numbers(phone_number: str) -> str:
    numbers = list(filter(str.isdigit, phone_number))[1:]
    return "+7({}{}{}){}{}{}-{}{}-{}{}".format(*numbers)
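A slicing-based variant may be easier to read than ten positional placeholders. This is only a sketch; the ValueError check goes beyond the task, which guarantees that every number has 11 digits.

```python
def format_numbers(phone_number: str) -> str:
    """Convert a Russian number in any accepted spelling to +7(XXX)XXX-XX-XX."""
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    if len(digits) != 11:
        raise ValueError(f"expected 11 digits, got {len(digits)}")
    rest = digits[1:]  # drop the leading 8 or 7
    return f"+7({rest[:3]}){rest[3:6]}-{rest[6:8]}-{rest[8:]}"


print(format_numbers("8(909)1011010"))  # +7(909)101-10-10
```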
phonenumbers Python Library
This is a Python port of Google’s libphonenumber library. It supports Python 2.5-2.7 and Python 3.x (in the same codebase, with no 2to3 conversion needed).
Original Java code is Copyright (C) 2009-2015 The Libphonenumber Authors.
Release HISTORY, derived from upstream release notes.
Installation
Install using pip with:
pip install phonenumbers
Example Usage
The main object that the library deals with is a PhoneNumber object. You can create this from a string representing a phone number using the parse function, but you also need to specify the country that the phone number is being dialled from (unless the number is in E.164 format, which is globally unique).
>>> import phonenumbers
>>> x = phonenumbers.parse("+442083661177", None)
>>> print(x)
Country Code: 44 National Number: 2083661177 Leading Zero: False
>>> type(x)
<class 'phonenumbers.phonenumber.PhoneNumber'>
>>> y = phonenumbers.parse("020 8366 1177", "GB")
>>> print(y)
Country Code: 44 National Number: 2083661177 Leading Zero: False
>>> x == y
True
>>> z = phonenumbers.parse("00 1 650 253 2222", "GB")  # as dialled from GB, not a GB number
>>> print(z)
Country Code: 1 National Number: 6502532222 Leading Zero(s): False
The PhoneNumber object that parse produces typically still needs to be validated, to check whether it’s a possible number (e.g. it has the right number of digits) or a valid number (e.g. it’s in an assigned exchange).
>>> z = phonenumbers.parse("+120012301", None)
>>> print(z)
Country Code: 1 National Number: 20012301 Leading Zero: False
>>> phonenumbers.is_possible_number(z)  # too few digits for USA
False
>>> phonenumbers.is_valid_number(z)
False
>>> z = phonenumbers.parse("+12001230101", None)
>>> print(z)
Country Code: 1 National Number: 2001230101 Leading Zero: False
>>> phonenumbers.is_possible_number(z)
True
>>> phonenumbers.is_valid_number(z)  # NPA 200 not used
False
The parse function will also fail completely (with a NumberParseException) on inputs that cannot be uniquely parsed, or that can’t possibly be phone numbers.
>>> z = phonenumbers.parse("02081234567", None)  # no region, no + => unparseable
Traceback (most recent call last):
  File "phonenumbers/phonenumberutil.py", line 2350, in parse
    "Missing or invalid default region.")
phonenumbers.phonenumberutil.NumberParseException: (0) Missing or invalid default region.
>>> z = phonenumbers.parse("gibberish", None)
Traceback (most recent call last):
  File "phonenumbers/phonenumberutil.py", line 2344, in parse
    "The string supplied did not seem to be a phone number.")
phonenumbers.phonenumberutil.NumberParseException: (1) The string supplied did not seem to be a phone number.
Once you’ve got a phone number, a common task is to format it in a standardized format. There are a few formats available (under PhoneNumberFormat), and the format_number function does the formatting.
>>> phonenumbers.format_number(x, phonenumbers.PhoneNumberFormat.NATIONAL)
'020 8366 1177'
>>> phonenumbers.format_number(x, phonenumbers.PhoneNumberFormat.INTERNATIONAL)
'+44 20 8366 1177'
>>> phonenumbers.format_number(x, phonenumbers.PhoneNumberFormat.E164)
'+442083661177'
If your application has a UI that allows the user to type in a phone number, it’s nice to get the formatting applied as the user types. The AsYouTypeFormatter object allows this.
>>> formatter = phonenumbers.AsYouTypeFormatter("US")
>>> formatter.input_digit("6")
'6'
>>> formatter.input_digit("5")
'65'
>>> formatter.input_digit("0")
'650'
>>> formatter.input_digit("2")
'650 2'
>>> formatter.input_digit("5")
'650 25'
>>> formatter.input_digit("3")
'650 253'
>>> formatter.input_digit("2")
'650-2532'
>>> formatter.input_digit("2")
'(650) 253-22'
>>> formatter.input_digit("2")
'(650) 253-222'
>>> formatter.input_digit("2")
'(650) 253-2222'
Sometimes, you’ve got a larger block of text that may or may not have some phone numbers inside it. For this, the PhoneNumberMatcher object provides the relevant functionality; you can iterate over it to retrieve a sequence of PhoneNumberMatch objects. Each of these match objects holds a PhoneNumber object together with information about where the match occurred in the original string.
>>> text = "Call me at 510-748-8230 if it's before 9:30, or on 703-4800500 after 10am."
>>> for match in phonenumbers.PhoneNumberMatcher(text, "US"):
...     print(match)
...
PhoneNumberMatch [11,23) 510-748-8230
PhoneNumberMatch [51,62) 703-4800500
>>> for match in phonenumbers.PhoneNumberMatcher(text, "US"):
...     print(phonenumbers.format_number(match.number, phonenumbers.PhoneNumberFormat.E164))
...
+15107488230
+17034800500
You might want to get some information about the location that corresponds to a phone number. The geocoder.description_for_number function does this, when possible.
>>> from phonenumbers import geocoder
>>> ch_number = phonenumbers.parse("0431234567", "CH")
>>> geocoder.description_for_number(ch_number, "de")
'Zürich'
>>> geocoder.description_for_number(ch_number, "en")
'Zurich'
>>> geocoder.description_for_number(ch_number, "fr")
'Zurich'
>>> geocoder.description_for_number(ch_number, "it")
'Zurigo'
For mobile numbers in some countries, you can also find out information about which carrier
originally owned a phone number.
>>> from phonenumbers import carrier
>>> ro_number = phonenumbers.parse("+40721234567", "RO")
>>> carrier.name_for_number(ro_number, "en")
'Vodafone'
You might also be able to retrieve a list of time zone names that the number potentially
belongs to.
>>> from phonenumbers import timezone
>>> gb_number = phonenumbers.parse("+447986123456", "GB")
>>> timezone.time_zones_for_number(gb_number)
('Atlantic/Reykjavik', 'Europe/London')
For more information about the other functionality available from the library, look in the unit tests or in the original
libphonenumber project.
Memory Usage
The library includes a lot of metadata, potentially giving a significant memory overhead. There are two mechanisms
for dealing with this.
- The normal metadata (just over 2 MiB of generated Python code) for the core functionality of the library is loaded on-demand, on a region-by-region basis (i.e. the metadata for a region is only loaded on the first time it is needed).
- Metadata for extended functionality is held in separate packages, which therefore need to be explicitly loaded separately. This affects:
  - The geocoding metadata (~19 MiB), which is held in phonenumbers.geocoder and used by the geocoding functions (geocoder.description_for_number, geocoder.description_for_valid_number or geocoder.country_name_for_number).
  - The carrier metadata (~1 MiB), which is held in phonenumbers.carrier and used by the mapping functions (carrier.name_for_number or carrier.name_for_valid_number).
  - The timezone metadata (~100 KiB), which is held in phonenumbers.timezone and used by the timezone functions (time_zones_for_number or time_zones_for_geographical_number).
The phonenumberslite version of the library does not include the geocoder, carrier and timezone packages, which can be useful if you have problems installing the main phonenumbers library due to space/memory limitations.
If you need to ensure that the metadata memory use is accounted for at start of day (i.e. that a subsequent on-demand
load of metadata will not cause a pause or memory exhaustion):
- Force-load the normal metadata by calling phonenumbers.PhoneMetadata.load_all().
- Force-load the extended metadata by importing the appropriate packages (phonenumbers.geocoder, phonenumbers.carrier, phonenumbers.timezone).
Static Typing
The library includes a set of type stub files to support static
type checking by library users. These stub files signal the types that should be used, and may also be of use in IDEs
which have integrated type checking functionalities.
These files are written for Python 3, and as such type checking the library with these stubs on Python 2.5-2.7 is
unsupported.
Project Layout
- The python/ directory holds the Python code.
- The resources/ directory is a copy of the resources/ directory from libphonenumber. This is not needed to run the Python code, but is needed when upstream changes to the master metadata need to be incorporated.
- The tools/ directory holds the tools that are used to process upstream changes to the master metadata.
4 answers
Use a library: phonenumbers (pypi, source)
A Python version of Google's common library for parsing, formatting, storing, and validating international phone numbers.
The readme is insufficient, but I found the code well documented.
kusut
Aug 14, 2011 at 18:56
Your examples seem to be formatted in groups of three digits except for the last group, so you can write a simple function that uses the thousands separator and appends the last digit:
>>> def phone_format(n):
...     return format(int(n[:-1]), ",").replace(",", "-") + n[-1]
...
>>> phone_format("5555555")
'555-5555'
>>> phone_format("5555555555")
'555-555-5555'
>>> phone_format("18005555555")
'1-800-555-5555'
utdemir
Aug 14, 2011 at 17:35
Here is one adapted from utdemir's solution and this solution that works with Python 2.6, since the "," formatter is new in Python 2.7.
import re

def phone_format(phone_number):
    clean_phone_number = re.sub('[^0-9]+', '', phone_number)
    formatted_phone_number = re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1-", "%d" % int(clean_phone_number[:-1])) + clean_phone_number[-1]
    return formatted_phone_number
Jonathan Mabe
Oct 17, 2014 at 01:26
A simple solution might be to start from the back and insert a hyphen after four digits, then make groups of three until the start of the string is reached. I am not aware of a built-in function or anything like that.
You might find this helpful:
http://www.diveintopython3.net/regular-expressions.html#phonenumbers
Regular expressions will be useful if you are accepting user input of phone numbers. I would not use the exact approach followed at the link above. Something simpler, like just pulling out the digits, is probably easier and works just as well.
Also, inserting commas into numbers is an analogous problem that has been solved efficiently elsewhere and could be adapted to this problem.
ChrisP
Aug 14, 2011 at 16:55
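The back-to-front grouping described in the last answer could be sketched as follows. This is one possible reading of that description, not the answerer's own code; the function name is reused from the answers above for easy comparison.

```python
def phone_format(number):
    """Group digits from the right: the last four, then threes toward the front."""
    digits = "".join(ch for ch in number if ch.isdigit())
    groups = [digits[-4:]]  # the final group of four digits
    rest = digits[:-4]
    while rest:
        groups.append(rest[-3:])  # peel off up to three digits at a time
        rest = rest[:-3]
    return "-".join(reversed(groups))


print(phone_format("18005555555"))  # 1-800-555-5555
```

Unlike the int-based versions, this keeps leading zeros, because the digits are never converted to a number.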
Text preprocessing is one of the most important tasks in Natural Language Processing. You may want to extract a number from a string. Writing a manual script for such a processing task requires a lot of effort and is often error-prone. Given the importance of these preprocessing tasks, regular expressions have been implemented in many programming languages to make them easier.
To use regular expressions in Python, import the re package as you would any other built-in module.
Steps for converting a 10-digit phone number to its corresponding US number format:
- Import the python re package.
- Write a function that takes the phone number to be formatted as argument and processes it.
- Now simply call the function and pass the value.
Example:
import re

def convert_phone_number(phone):
    # wrap a leading 3-digit group followed by "-" in parentheses
    num = re.sub(r'(?<!\S)(\d{3})-', r'(\1) ', phone)
    return num

print(convert_phone_number("Call geek 321-963-0612"))
Output:
Call geek (321) 963-0612
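The pattern above only rewrites the first group. As a hedged extension, the sketch below captures all three groups, so it also handles a 10-digit number written without separators or with dots or spaces. The name to_us_format and the set of accepted separators are assumptions made for illustration.

```python
import re

# Three groups (3-3-4 digits), optionally separated by "-", "." or a space,
# and not embedded in a longer run of digits.
TEN_DIGITS = re.compile(r"(?<!\d)(\d{3})[-. ]?(\d{3})[-. ]?(\d{4})(?!\d)")


def to_us_format(text):
    """Rewrite every 10-digit number in text as (NNN) NNN-NNNN."""
    return TEN_DIGITS.sub(r"(\1) \2-\3", text)


print(to_us_format("Call geek 321-963-0612"))
```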