>>> py3-tika: Building community/py3-tika 3.1.0-r0 (using abuild 3.16.0_rc4-r0) started Sat, 08 Nov 2025 15:30:05 +0000
>>> py3-tika: Validating /home/buildozer/aports/community/py3-tika/APKBUILD...
>>> py3-tika: Analyzing dependencies...
>>> py3-tika: Installing for build: build-base py3-requests py3-setuptools py3-gpep517 py3-wheel py3-pytest py3-pytest-benchmark py3-pytest-cov py3-coveralls py3-yaml openjdk21-jre-headless
( 1/66) Installing libbz2 (1.0.8-r6)
( 2/66) Installing libffi (3.5.2-r0)
( 3/66) Installing gdbm (1.26-r0)
( 4/66) Installing xz-libs (5.8.1-r0)
( 5/66) Installing mpdecimal (4.0.1-r0)
( 6/66) Installing libpanelw (6.5_p20251010-r0)
( 7/66) Installing sqlite-libs (3.51.0-r0)
( 8/66) Installing python3 (3.12.12-r0)
( 9/66) Installing python3-pycache-pyc0 (3.12.12-r0)
(10/66) Installing pyc (3.12.12-r0)
(11/66) Installing py3-certifi-pyc (2025.10.5-r0)
(12/66) Installing py3-requests-pyc (2.32.5-r0)
(13/66) Installing python3-pyc (3.12.12-r0)
(14/66) Installing py3-certifi (2025.10.5-r0)
(15/66) Installing py3-charset-normalizer (3.4.4-r0)
(16/66) Installing py3-charset-normalizer-pyc (3.4.4-r0)
(17/66) Installing py3-idna (3.10-r0)
(18/66) Installing py3-idna-pyc (3.10-r0)
(19/66) Installing py3-urllib3 (1.26.20-r0)
(20/66) Installing py3-urllib3-pyc (1.26.20-r0)
(21/66) Installing py3-requests (2.32.5-r0)
(22/66) Installing py3-parsing (3.2.3-r0)
(23/66) Installing py3-parsing-pyc (3.2.3-r0)
(24/66) Installing py3-packaging (25.0-r0)
(25/66) Installing py3-packaging-pyc (25.0-r0)
(26/66) Installing py3-setuptools (80.9.0-r2)
(27/66) Installing py3-setuptools-pyc (80.9.0-r2)
(28/66) Installing py3-installer (0.7.0-r2)
(29/66) Installing py3-installer-pyc (0.7.0-r2)
(30/66) Installing py3-gpep517 (19-r1)
(31/66) Installing py3-gpep517-pyc (19-r1)
(32/66) Installing py3-wheel (0.46.1-r0)
(33/66) Installing py3-wheel-pyc (0.46.1-r0)
(34/66) Installing py3-iniconfig (2.3.0-r0)
(35/66) Installing py3-iniconfig-pyc (2.3.0-r0)
(36/66) Installing py3-pluggy (1.6.0-r0)
(37/66) Installing py3-pluggy-pyc (1.6.0-r0)
(38/66) Installing py3-py (1.11.0-r4)
(39/66) Installing py3-py-pyc (1.11.0-r4)
(40/66) Installing py3-pygments (2.19.2-r0)
(41/66) Installing py3-pygments-pyc (2.19.2-r0)
(42/66) Installing py3-pytest (8.4.2-r1)
(43/66) Installing py3-pytest-pyc (8.4.2-r1)
(44/66) Installing py3-py-cpuinfo (9.0.0-r4)
(45/66) Installing py3-py-cpuinfo-pyc (9.0.0-r4)
(46/66) Installing py3-pytest-benchmark (4.0.0-r4)
(47/66) Installing py3-pytest-benchmark-pyc (4.0.0-r4)
(48/66) Installing py3-coverage (7.11.0-r0)
(49/66) Installing py3-coverage-pyc (7.11.0-r0)
(50/66) Installing py3-pytest-cov (5.0.0-r1)
(51/66) Installing py3-pytest-cov-pyc (5.0.0-r1)
(52/66) Installing py3-docopt (0.6.2-r11)
(53/66) Installing py3-docopt-pyc (0.6.2-r11)
(54/66) Installing py3-coveralls (3.3.1-r1)
(55/66) Installing py3-coveralls-pyc (3.3.1-r1)
(56/66) Installing yaml (0.2.5-r2)
(57/66) Installing py3-yaml (6.0.3-r0)
(58/66) Installing py3-yaml-pyc (6.0.3-r0)
(59/66) Installing java-common (1.0-r1)
(60/66) Installing libtasn1 (4.20.0-r0)
(61/66) Installing p11-kit (0.25.5-r2)
(62/66) Installing p11-kit-trust (0.25.5-r2)
(63/66) Installing ca-certificates (20250911-r0)
(64/66) Installing java-cacerts (1.1-r0)
(65/66) Installing openjdk21-jre-headless (21.0.9_p10-r0)
(66/66) Installing .makedepends-py3-tika (20251108.153020)
busybox-1.37.0-r24.trigger: Executing script...
java-common-1.0-r1.trigger: Executing script...
ca-certificates-20250911-r0.trigger: Executing script...
OK: 685 MiB in 171 packages
>>> py3-tika: Cleaning up srcdir
>>> py3-tika: Cleaning up pkgdir
>>> py3-tika: Cleaning up tmpdir
>>> py3-tika: Fetching https://distfiles.alpinelinux.org/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz
Connecting to distfiles.alpinelinux.org (172.105.82.32:443)
wget: server returned error: HTTP/1.1 404 Not Found
>>> py3-tika: Fetching py3-tika-3.1.0-gh.tar.gz::https://github.com/chrismattmann/tika-python/archive/refs/tags/3.1.0.tar.gz
Connecting to github.com (140.82.121.3:443)
Connecting to codeload.github.com (140.82.121.9:443)
saving to '/var/cache/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz.part'
py3-tika-3.1.0-gh.ta 100% |********************************| 55027 0:00:00 ETA
'/var/cache/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz.part' saved
/var/cache/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz: OK
>>> py3-tika: Fetching https://distfiles.alpinelinux.org/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz
/var/cache/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz: OK
>>> py3-tika: Unpacking /var/cache/distfiles/v3.23/py3-tika-3.1.0-gh.tar.gz...
2025-11-08 15:30:23,269 gpep517 INFO Building wheel via backend setuptools.build_meta:__legacy__
/home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/tika/__init__.py:20: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  __import__('pkg_resources').declare_namespace(__name__)
/usr/lib/python3.12/site-packages/setuptools/_distutils/dist.py:289: UserWarning: Unknown distribution option: 'test_suite'
  warnings.warn(msg)
/usr/lib/python3.12/site-packages/setuptools/dist.py:759: SetuptoolsDeprecationWarning: License classifiers are deprecated.
!!
        ********************************************************************************
        Please consider removing the following classifiers in favor of a SPDX license expression:

        License :: OSI Approved :: Apache Software License

        See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
        ********************************************************************************

!!
  self._finalize_license_expression()
2025-11-08 15:30:23,321 root INFO running bdist_wheel
2025-11-08 15:30:23,344 root INFO running build
2025-11-08 15:30:23,344 root INFO running build_py
2025-11-08 15:30:23,349 root INFO creating build/lib/tika
2025-11-08 15:30:23,350 root INFO copying tika/__init__.py -> build/lib/tika
2025-11-08 15:30:23,350 root INFO copying tika/translate.py -> build/lib/tika
2025-11-08 15:30:23,350 root INFO copying tika/parser.py -> build/lib/tika
2025-11-08 15:30:23,350 root INFO copying tika/unpack.py -> build/lib/tika
2025-11-08 15:30:23,351 root INFO copying tika/tika.py -> build/lib/tika
2025-11-08 15:30:23,351 root INFO copying tika/config.py -> build/lib/tika
2025-11-08 15:30:23,351 root INFO copying tika/language.py -> build/lib/tika
2025-11-08 15:30:23,352 root INFO copying tika/detector.py -> build/lib/tika
2025-11-08 15:30:23,352 root INFO copying tika/pdf.py -> build/lib/tika
2025-11-08 15:30:23,352 root INFO creating build/lib/tika/tests
2025-11-08 15:30:23,352 root INFO copying tika/tests/__init__.py -> build/lib/tika/tests
2025-11-08 15:30:23,353 root INFO copying tika/tests/tests_unpack.py -> build/lib/tika/tests
2025-11-08 15:30:23,353 root INFO copying tika/tests/test_from_file_service.py -> build/lib/tika/tests
2025-11-08 15:30:23,353 root INFO copying tika/tests/test_ssl_link.py -> build/lib/tika/tests
2025-11-08 15:30:23,353 root INFO copying tika/tests/utils.py -> build/lib/tika/tests
2025-11-08 15:30:23,354 root INFO copying tika/tests/test_tika.py -> build/lib/tika/tests
2025-11-08 15:30:23,354 root INFO copying tika/tests/memory_benchmark.py -> build/lib/tika/tests
2025-11-08 15:30:23,354 root INFO copying tika/tests/test_benchmark.py -> build/lib/tika/tests
2025-11-08 15:30:23,354 root INFO copying tika/tests/tests_params.py -> build/lib/tika/tests
2025-11-08 15:30:23,355 root INFO running egg_info
2025-11-08 15:30:23,358 root INFO creating tika.egg-info
2025-11-08 15:30:23,358 root INFO writing tika.egg-info/PKG-INFO
2025-11-08 15:30:23,360 root INFO writing dependency_links to tika.egg-info/dependency_links.txt
2025-11-08 15:30:23,360 root INFO writing entry points to tika.egg-info/entry_points.txt
2025-11-08 15:30:23,361 root INFO writing requirements to tika.egg-info/requires.txt
2025-11-08 15:30:23,361 root INFO writing top-level names to tika.egg-info/top_level.txt
2025-11-08 15:30:23,362 root INFO writing manifest file 'tika.egg-info/SOURCES.txt'
2025-11-08 15:30:23,366 root INFO reading manifest file 'tika.egg-info/SOURCES.txt'
2025-11-08 15:30:23,367 root INFO adding license file 'LICENSE.txt'
2025-11-08 15:30:23,368 root INFO writing manifest file 'tika.egg-info/SOURCES.txt'
2025-11-08 15:30:23,379 root INFO installing to build/bdist.linux-aarch64/wheel
2025-11-08 15:30:23,379 root INFO running install
2025-11-08 15:30:23,389 root INFO running install_lib
2025-11-08 15:30:23,395 root INFO creating build/bdist.linux-aarch64/wheel
2025-11-08 15:30:23,395 root INFO creating build/bdist.linux-aarch64/wheel/tika
2025-11-08 15:30:23,395 root INFO copying build/lib/tika/__init__.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,395 root INFO copying build/lib/tika/translate.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,396 root INFO copying build/lib/tika/parser.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,396 root INFO copying build/lib/tika/unpack.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,396 root INFO copying build/lib/tika/tika.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,396 root INFO creating build/bdist.linux-aarch64/wheel/tika/tests
2025-11-08 15:30:23,397 root INFO copying build/lib/tika/tests/__init__.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,397 root INFO copying build/lib/tika/tests/tests_unpack.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,397 root INFO copying build/lib/tika/tests/test_from_file_service.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,397 root INFO copying build/lib/tika/tests/test_ssl_link.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,397 root INFO copying build/lib/tika/tests/utils.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,398 root INFO copying build/lib/tika/tests/test_tika.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,398 root INFO copying build/lib/tika/tests/memory_benchmark.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,398 root INFO copying build/lib/tika/tests/test_benchmark.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,398 root INFO copying build/lib/tika/tests/tests_params.py -> build/bdist.linux-aarch64/wheel/./tika/tests
2025-11-08 15:30:23,398 root INFO copying build/lib/tika/config.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,399 root INFO copying build/lib/tika/language.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,399 root INFO copying build/lib/tika/detector.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,399 root INFO copying build/lib/tika/pdf.py -> build/bdist.linux-aarch64/wheel/./tika
2025-11-08 15:30:23,399 root INFO running install_egg_info
2025-11-08 15:30:23,404 root INFO Copying tika.egg-info to build/bdist.linux-aarch64/wheel/./tika-3.1.0-py3.12.egg-info
2025-11-08 15:30:23,405 root INFO running install_scripts
2025-11-08 15:30:23,407 root INFO creating build/bdist.linux-aarch64/wheel/tika-3.1.0.dist-info/WHEEL
2025-11-08 15:30:23,408 wheel INFO creating '/home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/.dist/.tmp-ips0p6q_/tika-3.1.0-py3-none-any.whl' and adding 'build/bdist.linux-aarch64/wheel' to it
2025-11-08 15:30:23,408 wheel INFO adding 'tika/__init__.py'
2025-11-08 15:30:23,408 wheel INFO adding 'tika/config.py'
2025-11-08 15:30:23,409 wheel INFO adding 'tika/detector.py'
2025-11-08 15:30:23,409 wheel INFO adding 'tika/language.py'
2025-11-08 15:30:23,409 wheel INFO adding 'tika/parser.py'
2025-11-08 15:30:23,409 wheel INFO adding 'tika/pdf.py'
2025-11-08 15:30:23,410 wheel INFO adding 'tika/tika.py'
2025-11-08 15:30:23,410 wheel INFO adding 'tika/translate.py'
2025-11-08 15:30:23,410 wheel INFO adding 'tika/unpack.py'
2025-11-08 15:30:23,411 wheel INFO adding 'tika/tests/__init__.py'
2025-11-08 15:30:23,411 wheel INFO adding 'tika/tests/memory_benchmark.py'
2025-11-08 15:30:23,411 wheel INFO adding 'tika/tests/test_benchmark.py'
2025-11-08 15:30:23,411 wheel INFO adding 'tika/tests/test_from_file_service.py'
2025-11-08 15:30:23,411 wheel INFO adding 'tika/tests/test_ssl_link.py'
2025-11-08 15:30:23,412 wheel INFO adding 'tika/tests/test_tika.py'
2025-11-08 15:30:23,412 wheel INFO adding 'tika/tests/tests_params.py'
2025-11-08 15:30:23,412 wheel INFO adding 'tika/tests/tests_unpack.py'
2025-11-08 15:30:23,412 wheel INFO adding 'tika/tests/utils.py'
2025-11-08 15:30:23,413 wheel INFO adding 'tika-3.1.0.dist-info/licenses/LICENSE.txt'
2025-11-08 15:30:23,413 wheel INFO adding 'tika-3.1.0.dist-info/METADATA'
2025-11-08 15:30:23,413 wheel INFO adding 'tika-3.1.0.dist-info/WHEEL'
2025-11-08 15:30:23,413 wheel INFO adding 'tika-3.1.0.dist-info/entry_points.txt'
2025-11-08 15:30:23,413 wheel INFO adding 'tika-3.1.0.dist-info/top_level.txt'
2025-11-08 15:30:23,414 wheel INFO adding 'tika-3.1.0.dist-info/zip-safe'
2025-11-08 15:30:23,414 wheel INFO adding 'tika-3.1.0.dist-info/RECORD'
2025-11-08 15:30:23,414 root INFO removing build/bdist.linux-aarch64/wheel
2025-11-08 15:30:23,415 gpep517 INFO The backend produced .dist/tika-3.1.0-py3-none-any.whl
tika-3.1.0-py3-none-any.whl
*** Error compiling '/home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/.testenv/lib/python3.12/site-packages/tika/pdf.py'...
  File "/home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/.testenv/lib/python3.12/site-packages/tika/pdf.py", line 19
    import .tika.parser
           ^
SyntaxError: invalid syntax

============================= test session starts ==============================
platform linux -- Python 3.12.12, pytest-8.4.2, pluggy-1.6.0 -- /home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/.testenv/bin/python3
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0
plugins: cov-5.0.0, benchmark-4.0.0
collecting ... collected 23 items / 2 deselected / 21 selected

tika/tests/test_benchmark.py::test_local_binary PASSED [ 4%]
tika/tests/test_benchmark.py::test_parser_buffer PASSED [ 9%]
tika/tests/test_benchmark.py::test_parser_buffer_zlib_input PASSED [ 14%]
tika/tests/test_benchmark.py::test_parser_buffer_gzip_input PASSED [ 19%]
tika/tests/test_benchmark.py::test_local_binary_with_gzip_output PASSED [ 23%]
tika/tests/test_benchmark.py::test_parser_buffer_with_gzip_output PASSED [ 28%]
tika/tests/test_benchmark.py::test_parser_buffer_zlib_input_and_gzip_output PASSED [ 33%]
tika/tests/test_benchmark.py::test_parser_buffer_gzip_input_and_gzip_output PASSED [ 38%]
tika/tests/test_from_file_service.py::CreateTest::test_default_service PASSED [ 42%]
tika/tests/test_from_file_service.py::CreateTest::test_default_service_explicit PASSED [ 47%]
tika/tests/test_from_file_service.py::CreateTest::test_invalid_service PASSED [ 52%]
tika/tests/test_from_file_service.py::CreateTest::test_meta_service PASSED [ 57%]
tika/tests/test_from_file_service.py::CreateTest::test_remote_endpoint PASSED [ 61%] tika/tests/test_from_file_service.py::CreateTest::test_text_service PASSED [ 66%] tika/tests/test_tika.py::CreateTest::test_kill_server PASSED [ 71%] tika/tests/test_tika.py::CreateTest::test_local_binary PASSED [ 76%] tika/tests/test_tika.py::CreateTest::test_local_buffer PASSED [ 80%] tika/tests/test_tika.py::CreateTest::test_local_path PASSED [ 85%] tika/tests/test_tika.py::CreateTest::test_remote_html FAILED [ 90%] tika/tests/test_tika.py::CreateTest::test_remote_mp3 PASSED [ 95%] tika/tests/test_tika.py::CreateTest::test_remote_pdf PASSED [100%] =================================== FAILURES =================================== _________________________ CreateTest.test_remote_html __________________________ self = http_class = req = , http_conn_args = {} host = 'neverssl.com', h = headers = {'Connection': 'close', 'Host': 'neverssl.com', 'User-Agent': 'Python-urllib/3.12'} def do_open(self, http_class, req, **http_conn_args): """Return an HTTPResponse object for the request, using http_class. http_class must implement the HTTPConnection API from http.client. """ host = req.host if not host: raise URLError('no host given') # will parse host:port h = http_class(host, timeout=req.timeout, **http_conn_args) h.set_debuglevel(self._debuglevel) headers = dict(req.unredirected_hdrs) headers.update({k: v for k, v in req.headers.items() if k not in headers}) # TODO(jhylton): Should this be redesigned to handle # persistent connections? # We want to make an HTTP/1.1 request, but the addinfourl # class isn't prepared to deal with a persistent connection. # It will try to read all remaining data from the socket, # which will block while the server waits for the next request. # So make sure the connection gets closed after the (only) # request. 
headers["Connection"] = "close" headers = {name.title(): val for name, val in headers.items()} if req._tunnel_host: tunnel_headers = {} proxy_auth_hdr = "Proxy-Authorization" if proxy_auth_hdr in headers: tunnel_headers[proxy_auth_hdr] = headers[proxy_auth_hdr] # Proxy-Authorization should not be sent to origin # server. del headers[proxy_auth_hdr] h.set_tunnel(req._tunnel_host, headers=tunnel_headers) try: try: > h.request(req.get_method(), req.selector, req.data, headers, encode_chunked=req.has_header('Transfer-encoding')) /usr/lib/python3.12/urllib/request.py:1344: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/http/client.py:1338: in request self._send_request(method, url, body, headers, encode_chunked) /usr/lib/python3.12/http/client.py:1384: in _send_request self.endheaders(body, encode_chunked=encode_chunked) /usr/lib/python3.12/http/client.py:1333: in endheaders self._send_output(message_body, encode_chunked=encode_chunked) /usr/lib/python3.12/http/client.py:1093: in _send_output self.send(msg) /usr/lib/python3.12/http/client.py:1037: in send self.connect() /usr/lib/python3.12/http/client.py:1003: in connect self.sock = self._create_connection( /usr/lib/python3.12/socket.py:865: in create_connection raise exceptions[0] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ address = ('neverssl.com', 80), timeout = source_address = None def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT, source_address=None, *, all_errors=False): """Connect to *address* and return the socket object. Convenience function. Connect to *address* (a 2-tuple ``(host, port)``) and return the socket object. Passing the optional *timeout* parameter will set the timeout on the socket instance before attempting to connect. If no *timeout* is supplied, the global default timeout setting returned by :func:`getdefaulttimeout` is used. 
If *source_address* is set it must be a tuple of (host, port) for the socket to bind as a source address before making the connection. A host of '' or port 0 tells the OS to use the default. When a connection cannot be created, raises the last error if *all_errors* is False, and an ExceptionGroup of all errors if *all_errors* is True. """ host, port = address exceptions = [] for res in getaddrinfo(host, port, 0, SOCK_STREAM): af, socktype, proto, canonname, sa = res sock = None try: sock = socket(af, socktype, proto) if timeout is not _GLOBAL_DEFAULT_TIMEOUT: sock.settimeout(timeout) if source_address: sock.bind(source_address) > sock.connect(sa) E TimeoutError: [Errno 110] Operation timed out /usr/lib/python3.12/socket.py:850: TimeoutError During handling of the above exception, another exception occurred: urlOrPath = 'http://neverssl.com/index.html', destPath = '/tmp/index.html' def getRemoteFile(urlOrPath, destPath): ''' Fetches URL to local path or just returns absolute path. :param urlOrPath: resource locator, generally URL or path :param destPath: path to store the resource, usually a path on file system :return: tuple having (path, 'local'/'remote'/'binary') ''' # handle binary stream input if _is_file_object(urlOrPath): return (urlOrPath.name, 'binary') urlp = urlparse(urlOrPath) if urlp.scheme == '': return (os.path.abspath(urlOrPath), 'local') elif urlp.scheme not in ('http', 'https'): return (urlOrPath, 'local') else: filename = toFilename(urlOrPath) destPath = destPath + '/' + filename log.info('Retrieving %s to %s.' 
% (urlOrPath, destPath)) try: > urlretrieve(urlOrPath, destPath) tika/tika.py:778: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/urllib/request.py:240: in urlretrieve with contextlib.closing(urlopen(url, data)) as fp: ^^^^^^^^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:215: in urlopen return opener.open(url, data, timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:515: in open response = self._open(req, data) ^^^^^^^^^^^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:532: in _open result = self._call_chain(self.handle_open, protocol, protocol + /usr/lib/python3.12/urllib/request.py:492: in _call_chain result = func(*args) ^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:1373: in http_open return self.do_open(http.client.HTTPConnection, req) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = http_class = req = , http_conn_args = {} host = 'neverssl.com', h = headers = {'Connection': 'close', 'Host': 'neverssl.com', 'User-Agent': 'Python-urllib/3.12'} def do_open(self, http_class, req, **http_conn_args): """Return an HTTPResponse object for the request, using http_class. http_class must implement the HTTPConnection API from http.client. """ host = req.host if not host: raise URLError('no host given') # will parse host:port h = http_class(host, timeout=req.timeout, **http_conn_args) h.set_debuglevel(self._debuglevel) headers = dict(req.unredirected_hdrs) headers.update({k: v for k, v in req.headers.items() if k not in headers}) # TODO(jhylton): Should this be redesigned to handle # persistent connections? # We want to make an HTTP/1.1 request, but the addinfourl # class isn't prepared to deal with a persistent connection. # It will try to read all remaining data from the socket, # which will block while the server waits for the next request. 
# So make sure the connection gets closed after the (only) # request. headers["Connection"] = "close" headers = {name.title(): val for name, val in headers.items()} if req._tunnel_host: tunnel_headers = {} proxy_auth_hdr = "Proxy-Authorization" if proxy_auth_hdr in headers: tunnel_headers[proxy_auth_hdr] = headers[proxy_auth_hdr] # Proxy-Authorization should not be sent to origin # server. del headers[proxy_auth_hdr] h.set_tunnel(req._tunnel_host, headers=tunnel_headers) try: try: h.request(req.get_method(), req.selector, req.data, headers, encode_chunked=req.has_header('Transfer-encoding')) except OSError as err: # timeout error > raise URLError(err) E urllib.error.URLError: /usr/lib/python3.12/urllib/request.py:1347: URLError During handling of the above exception, another exception occurred: self = http_class = req = , http_conn_args = {} host = 'neverssl.com', h = headers = {'Connection': 'close', 'Host': 'neverssl.com', 'User-Agent': 'Python-urllib/3.12'} def do_open(self, http_class, req, **http_conn_args): """Return an HTTPResponse object for the request, using http_class. http_class must implement the HTTPConnection API from http.client. """ host = req.host if not host: raise URLError('no host given') # will parse host:port h = http_class(host, timeout=req.timeout, **http_conn_args) h.set_debuglevel(self._debuglevel) headers = dict(req.unredirected_hdrs) headers.update({k: v for k, v in req.headers.items() if k not in headers}) # TODO(jhylton): Should this be redesigned to handle # persistent connections? # We want to make an HTTP/1.1 request, but the addinfourl # class isn't prepared to deal with a persistent connection. # It will try to read all remaining data from the socket, # which will block while the server waits for the next request. # So make sure the connection gets closed after the (only) # request. 
headers["Connection"] = "close" headers = {name.title(): val for name, val in headers.items()} if req._tunnel_host: tunnel_headers = {} proxy_auth_hdr = "Proxy-Authorization" if proxy_auth_hdr in headers: tunnel_headers[proxy_auth_hdr] = headers[proxy_auth_hdr] # Proxy-Authorization should not be sent to origin # server. del headers[proxy_auth_hdr] h.set_tunnel(req._tunnel_host, headers=tunnel_headers) try: try: > h.request(req.get_method(), req.selector, req.data, headers, encode_chunked=req.has_header('Transfer-encoding')) /usr/lib/python3.12/urllib/request.py:1344: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/http/client.py:1338: in request self._send_request(method, url, body, headers, encode_chunked) /usr/lib/python3.12/http/client.py:1384: in _send_request self.endheaders(body, encode_chunked=encode_chunked) /usr/lib/python3.12/http/client.py:1333: in endheaders self._send_output(message_body, encode_chunked=encode_chunked) /usr/lib/python3.12/http/client.py:1093: in _send_output self.send(msg) /usr/lib/python3.12/http/client.py:1037: in send self.connect() /usr/lib/python3.12/http/client.py:1003: in connect self.sock = self._create_connection( /usr/lib/python3.12/socket.py:865: in create_connection raise exceptions[0] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ address = ('neverssl.com', 80), timeout = source_address = None def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT, source_address=None, *, all_errors=False): """Connect to *address* and return the socket object. Convenience function. Connect to *address* (a 2-tuple ``(host, port)``) and return the socket object. Passing the optional *timeout* parameter will set the timeout on the socket instance before attempting to connect. If no *timeout* is supplied, the global default timeout setting returned by :func:`getdefaulttimeout` is used. 
If *source_address* is set it must be a tuple of (host, port) for the socket to bind as a source address before making the connection. A host of '' or port 0 tells the OS to use the default. When a connection cannot be created, raises the last error if *all_errors* is False, and an ExceptionGroup of all errors if *all_errors* is True. """ host, port = address exceptions = [] for res in getaddrinfo(host, port, 0, SOCK_STREAM): af, socktype, proto, canonname, sa = res sock = None try: sock = socket(af, socktype, proto) if timeout is not _GLOBAL_DEFAULT_TIMEOUT: sock.settimeout(timeout) if source_address: sock.bind(source_address) > sock.connect(sa) E TimeoutError: [Errno 110] Operation timed out /usr/lib/python3.12/socket.py:850: TimeoutError During handling of the above exception, another exception occurred: self = def test_remote_html(self): """parse remote HTML""" > self.assertTrue(tika.parser.from_file('http://neverssl.com/index.html')) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ tika/tests/test_tika.py:37: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ tika/tika.py:328: in parse1 path, file_type = getRemoteFile(urlOrPath, TikaFilesPath) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ tika/tika.py:788: in getRemoteFile urlretrieve(urlOrPath, destPath) /usr/lib/python3.12/urllib/request.py:240: in urlretrieve with contextlib.closing(urlopen(url, data)) as fp: ^^^^^^^^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:215: in urlopen return opener.open(url, data, timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:515: in open response = self._open(req, data) ^^^^^^^^^^^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:532: in _open result 
= self._call_chain(self.handle_open, protocol, protocol + /usr/lib/python3.12/urllib/request.py:492: in _call_chain result = func(*args) ^^^^^^^^^^^ /usr/lib/python3.12/urllib/request.py:1373: in http_open return self.do_open(http.client.HTTPConnection, req) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = http_class = req = , http_conn_args = {} host = 'neverssl.com', h = headers = {'Connection': 'close', 'Host': 'neverssl.com', 'User-Agent': 'Python-urllib/3.12'} def do_open(self, http_class, req, **http_conn_args): """Return an HTTPResponse object for the request, using http_class. http_class must implement the HTTPConnection API from http.client. """ host = req.host if not host: raise URLError('no host given') # will parse host:port h = http_class(host, timeout=req.timeout, **http_conn_args) h.set_debuglevel(self._debuglevel) headers = dict(req.unredirected_hdrs) headers.update({k: v for k, v in req.headers.items() if k not in headers}) # TODO(jhylton): Should this be redesigned to handle # persistent connections? # We want to make an HTTP/1.1 request, but the addinfourl # class isn't prepared to deal with a persistent connection. # It will try to read all remaining data from the socket, # which will block while the server waits for the next request. # So make sure the connection gets closed after the (only) # request. headers["Connection"] = "close" headers = {name.title(): val for name, val in headers.items()} if req._tunnel_host: tunnel_headers = {} proxy_auth_hdr = "Proxy-Authorization" if proxy_auth_hdr in headers: tunnel_headers[proxy_auth_hdr] = headers[proxy_auth_hdr] # Proxy-Authorization should not be sent to origin # server. 
del headers[proxy_auth_hdr] h.set_tunnel(req._tunnel_host, headers=tunnel_headers) try: try: h.request(req.get_method(), req.selector, req.data, headers, encode_chunked=req.has_header('Transfer-encoding')) except OSError as err: # timeout error > raise URLError(err) E urllib.error.URLError: /usr/lib/python3.12/urllib/request.py:1347: URLError ----------------------------- Captured stderr call ----------------------------- 2025-11-08 15:30:49,646 [MainThread ] [INFO ] Retrieving http://neverssl.com/index.html to /tmp/index.html. ------------------------------ Captured log call ------------------------------- INFO tika.tika:tika.py:776 Retrieving http://neverssl.com/index.html to /tmp/index.html. =============================== warnings summary =============================== tika/__init__.py:20 /home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/tika/__init__.py:20: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. __import__('pkg_resources').declare_namespace(__name__) tika/__init__.py:20 /home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/tika/__init__.py:20: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('tika')`. Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages __import__('pkg_resources').declare_namespace(__name__) tika/tests/__init__.py:18 /home/buildozer/aports/community/py3-tika/src/tika-python-3.1.0/tika/tests/__init__.py:18: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('tika.tests')`. Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. 
  See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    __import__('pkg_resources').declare_namespace(__name__)

../../../../../../../usr/lib/python3.12/site-packages/pkg_resources/__init__.py:2558
  /usr/lib/python3.12/site-packages/pkg_resources/__init__.py:2558: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('tika')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(parent)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

------------------------------------------------------------------------------------- benchmark: 8 tests -------------------------------------------------------------------------------------
Name (time in ms)                                          Min             Max            Mean          StdDev          Median              IQR  Outliers             OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_parser_buffer_with_gzip_output              14.7440 (1.0)   25.3658 (1.0)   17.2098 (1.0)   1.9959 (1.05)   16.7730 (1.0)     1.4542 (1.0)      13;4   58.1065 (1.0)      65           1
test_local_binary_with_gzip_output              15.3155 (1.04)  29.8904 (1.18)  18.2300 (1.06)   2.6657 (1.41)  17.5230 (1.04)    1.6882 (1.16)       9;7  54.8548 (0.94)      63           1
test_parser_buffer_zlib_input_and_gzip_output   17.2208 (1.17)  28.6373 (1.13)  19.4177 (1.13)    1.8957 (1.0)  18.9664 (1.13)    2.2487 (1.55)      11;1  51.4994 (0.89)      56           1
test_parser_buffer_gzip_input_and_gzip_output   17.6873 (1.20)  36.1214 (1.42)  20.4757 (1.19)   3.6132 (1.91)  19.4765 (1.16)    2.7570 (1.90)       4;3  48.8383 (0.84)      46           1
test_parser_buffer                              18.5629 (1.26)  64.4104 (2.54)  28.5675 (1.66)   9.2900 (4.90)  25.4024 (1.51)    7.5569 (5.20)       4;2  35.0048 (0.60)      29           1
test_parser_buffer_gzip_input                   20.3579 (1.38)  30.6829 (1.21)  23.8464 (1.39)   2.9885 (1.58)  22.7476 (1.36)    3.6404 (2.50)      10;0  41.9351 (0.72)      36           1
test_parser_buffer_zlib_input                   21.6497 (1.47)  37.7451 (1.49)  26.5979 (1.55)   3.9332 (2.07)  26.1492 (1.56)    2.1527 (1.48)       6;4  37.5969 (0.65)      19           1
test_local_binary                               46.1433 (3.13)  67.6070 (2.67)  58.0179 (3.37)   9.5239 (5.02)  61.6241 (3.67)  16.7887 (11.54)       2;0  17.2361 (0.30)       5           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=========================== short test summary info ============================
FAILED tika/tests/test_tika.py::CreateTest::test_remote_html - urllib.error.U...
====== 1 failed, 20 passed, 2 deselected, 4 warnings in 576.48s (0:09:36) ======
>>> ERROR: py3-tika: check failed
>>> py3-tika: Uninstalling dependencies...
( 1/66) Purging .makedepends-py3-tika (20251108.153020)
( 2/66) Purging py3-setuptools-pyc (80.9.0-r2)
( 3/66) Purging py3-setuptools (80.9.0-r2)
( 4/66) Purging py3-gpep517-pyc (19-r1)
( 5/66) Purging py3-gpep517 (19-r1)
( 6/66) Purging py3-installer-pyc (0.7.0-r2)
( 7/66) Purging py3-installer (0.7.0-r2)
( 8/66) Purging py3-wheel-pyc (0.46.1-r0)
( 9/66) Purging py3-wheel (0.46.1-r0)
(10/66) Purging py3-pytest-benchmark-pyc (4.0.0-r4)
(11/66) Purging py3-pytest-benchmark (4.0.0-r4)
(12/66) Purging py3-py-cpuinfo-pyc (9.0.0-r4)
(13/66) Purging py3-py-cpuinfo (9.0.0-r4)
(14/66) Purging py3-pytest-cov-pyc (5.0.0-r1)
(15/66) Purging py3-pytest-cov (5.0.0-r1)
(16/66) Purging py3-pytest-pyc (8.4.2-r1)
(17/66) Purging py3-pytest (8.4.2-r1)
(18/66) Purging py3-iniconfig-pyc (2.3.0-r0)
(19/66) Purging py3-iniconfig (2.3.0-r0)
(20/66) Purging py3-packaging-pyc (25.0-r0)
(21/66) Purging py3-packaging (25.0-r0)
(22/66) Purging py3-parsing-pyc (3.2.3-r0)
(23/66) Purging py3-parsing (3.2.3-r0)
(24/66) Purging py3-pluggy-pyc (1.6.0-r0)
(25/66) Purging py3-pluggy (1.6.0-r0)
(26/66) Purging py3-py-pyc (1.11.0-r4)
(27/66) Purging py3-py (1.11.0-r4)
(28/66) Purging py3-pygments-pyc (2.19.2-r0)
(29/66) Purging py3-pygments (2.19.2-r0)
(30/66) Purging py3-coveralls-pyc (3.3.1-r1)
(31/66) Purging py3-coveralls (3.3.1-r1)
(32/66) Purging py3-coverage-pyc (7.11.0-r0)
(33/66) Purging py3-coverage (7.11.0-r0)
(34/66) Purging py3-docopt-pyc (0.6.2-r11)
(35/66) Purging py3-docopt (0.6.2-r11)
(36/66) Purging py3-requests-pyc (2.32.5-r0)
(37/66) Purging py3-requests (2.32.5-r0)
(38/66) Purging py3-certifi-pyc (2025.10.5-r0)
(39/66) Purging py3-certifi (2025.10.5-r0)
(40/66) Purging py3-charset-normalizer-pyc (3.4.4-r0)
(41/66) Purging py3-charset-normalizer (3.4.4-r0)
(42/66) Purging py3-idna-pyc (3.10-r0)
(43/66) Purging py3-idna (3.10-r0)
(44/66) Purging py3-urllib3-pyc (1.26.20-r0)
(45/66) Purging py3-urllib3 (1.26.20-r0)
(46/66) Purging py3-yaml-pyc (6.0.3-r0)
(47/66) Purging py3-yaml (6.0.3-r0)
(48/66) Purging python3-pyc (3.12.12-r0)
(49/66) Purging python3-pycache-pyc0 (3.12.12-r0)
(50/66) Purging pyc (3.12.12-r0)
(51/66) Purging python3 (3.12.12-r0)
(52/66) Purging openjdk21-jre-headless (21.0.9_p10-r0)
(53/66) Purging java-common (1.0-r1)
(54/66) Purging java-cacerts (1.1-r0)
java-cacerts-1.1-r0.pre-deinstall: Executing script...
(55/66) Purging p11-kit-trust (0.25.5-r2)
(56/66) Purging ca-certificates (20250911-r0)
ca-certificates-20250911-r0.post-deinstall: Executing script...
(57/66) Purging gdbm (1.26-r0)
(58/66) Purging libbz2 (1.0.8-r6)
(59/66) Purging libpanelw (6.5_p20251010-r0)
(60/66) Purging mpdecimal (4.0.1-r0)
(61/66) Purging p11-kit (0.25.5-r2)
(62/66) Purging sqlite-libs (3.51.0-r0)
(63/66) Purging xz-libs (5.8.1-r0)
(64/66) Purging yaml (0.2.5-r2)
(65/66) Purging libffi (3.5.2-r0)
(66/66) Purging libtasn1 (4.20.0-r0)
busybox-1.37.0-r24.trigger: Executing script...
OK: 431 MiB in 105 packages