Realm 1: Field Survey and Site Collections (https://doi.org/10.7302/pfab-2b45)

Read me file*

Description of PASH Deep Blue Data Realm 1: Field Survey and Site Collections

All databases, field notebooks, paper maps, GIS files, photographs, and photo descriptions related to the intensive survey of tracts and tumuli and to the collection of sites have been made available in PASH Deep Blue Data Realm 1.

Over the course of five years, 11 field teams (Teams A–K) surveyed 2530 tracts in Shkrel and Shtoj, covering 16.1 km². Survey data were eventually collapsed into six geographic zones (1–6), as described below (see Figure 1.6). Survey teams documented 172 tumuli in the plains of Shtoj and Shkrel and discovered five previously unknown sites: Kullaj (Site 002), Rasek (Site 005), Kodër Boks (Site 007), Omaraj (Site 009), and Fashina Hill (Site 013).

PASH grid-collected one cave site, Shpella e Hudhrës (Site 004; using 1x1 m grid squares), and four hillfort sites using 20x20 m grid systems: Kratul i Madh (Site 001), Kullaj (Site 002), Vorfë (Site 003), and Gajtan (Site 011). We tract-collected five sites: the densely overgrown hill settlements of Marshej (Site 012) and Zagorë (Site 015), the Medieval–Modern castle of Drisht (Site 017) and neighboring hilltop of Muzhile, the newly discovered prehistoric site of Kodër Boks (Site 007), and Fashina Hill (Site 013), located just to the west of Gajtan. We also collected the newly identified Paleolithic sites of Rasek (Site 005) and Omaraj (Site 009), located along the Kir River, using microtracts (walking shoulder to shoulder). Finally, PASH visited and documented the rock art site of Derraj (Site 010). Three settlements were test excavated (Gajtan, Kodër Boks, and Zagorë). Four tumuli (085, 052, 088, and 099) were subjected to rescue excavations.

All tracts were surveyed using standard Mediterranean survey methods. For each tract (recorded consecutively by team letter and number, e.g., A-001, A-002), surveyors walked at 15-m intervals and counted all tile/brick, ceramic fragments, and small finds. The category of “small finds” included, but was not limited to, chipped stone, metal tools, spindle whorls, loom weights, grinding stones, and glass. Field walkers were instructed to collect all small finds as well as all diagnostic pottery, i.e., potsherds having a recognizable form or features (such as rim, base, or handle) or decoration (such as paint, glaze, or incision). Pottery was also sampled based on fabric type (color, texture, surface treatment, etc.). Only sherds that were larger than a thumbnail were counted and collected.

We conducted a full-coverage survey, meaning that all land forms (including fields, hills, and terraces) in each survey zone were surveyed, unless the landowner objected or the vegetation was so dense as to render survey impossible. Visibility (measured as a percentage, with 0% indicating no visible ground surface and 100% a completely visible ground surface) varied by zone but was generally high; the overall visibility for the whole survey was 60%.

Each tract was photographed, and a GPS point at the center of the tract was obtained. Photos and photo descriptions were maintained in separate databases by each team. Information about tracts was recorded by hand in notebooks by team leaders. This included data about a tract’s soil, geology, ground cover (plants growing in the tract, crops planted in the field), associated structures, associated archaeological features (including tumuli), informant testimony, visibility, and, most critically, artifact counts. These data were entered into a dedicated database (“CU,” collection unit) each evening. Team leaders also mapped tracts onto preprinted aerial photographs while in the field. Tracts were then drawn into a geographic information system (GIS) as shapes and linked to the collection unit database via tract number.
This allowed us to generate maps of survey data, including artifact counts, in real time.

Tumuli (recorded consecutively by number with a “T” prefix: T-001, T-002, etc.) were mapped, measured, photographed, and described separately, and these data were entered into a separate database dedicated to tumuli.

Our study region encompassed several known sites that were targeted for further investigation. New sites were identified when artifact densities by tract were clustered, creating a so-called “hot spot.” Additional artifact clusters were treated not as sites, but as places of special interest or activity areas, which were visited and further evaluated. All PASH sites were given a site number (Site 001, Site 002, and so on) and described in a site database. All sites were targeted for additional study, whether surface collection, excavation, or both.

Site collection strategies varied by site, depending on collection goals, terrain, and visibility. Generally speaking, the main reasons to conduct site collection were: (1) to determine site size and edges; (2) to clarify site chronology, i.e., the periods represented; (3) to inform ideas about site function; and (4) to guide geophysics and test excavations. In principle, all four of these site-collection goals could have been addressed at every site. In practice, at most sites some goals were met, but not all.

Three site-collection methods were employed. Hillforts and the cave site, Shpella e Hudhrës, were grid-collected. At the hillforts, 20x20 m squares were laid out using 50-m tapes and compasses, running north-south along the y-axis, or following the natural topography of the site. At the cave, 1x1 m grid squares were employed, laid out with tapes along the cave’s vertical axis. These grid collections are discussed in more detail below. Sites that were very large (e.g., Drisht, Kodër Boks) or characterized by steep terrain and poor visibility (e.g., Marshej, Zagorë) were tract collected.
Two sites were microtracted: Rasek and Omaraj. The tracts originally used to identify these sites were resurveyed by individuals walking shoulder to shoulder, which recovered many more artifacts (in these cases, chipped stone).

Realm 1 is divided into two sub-collections: Survey Data and Site Data. Both sub-collections are organized by data type.

Survey Data (https://doi.org/10.7302/fhp2-2k12)

Each data type in the Survey Data sub-collection is organized by survey team (A–K), each of which operated in a different location. Each survey team record includes the following works:

Tract photos and photologs – Folders with photos taken of and in tracts by a survey team member using a digital camera, organized by date taken. These are JPEGs. Each photo is labeled with the date taken and an appended ID number, in consecutive order (e.g., A-020610-001 = A team, June 2, 2010, photo 001). ID numbers were repeated each subsequent day of survey, beginning again with 001. At minimum, each tract was photographed to document its location, orientation, and ground cover. Photos were also taken of features within tracts: old buildings, tumuli, caves, etc. A small number of candid shots of survey work are also included. The photolog is a .CSV export from Excel. It lists each photo taken by a team, in order, by photo number (date, ID number), with a description. Folders within each .zip file correspond to “Sequence Number” (Column A) in the photolog.

Survey maps – A PDF of scans of the original tract maps drawn in the field by each team leader. These were digitized each night to create shape files for each tract in the PASH Geographic Information System (GIS).

Spatial data files – GIS shape files for each tract along with additional, generic spatial data, including files for tract visibility, overall pottery density, and overall tile density. The latter two are not chronologically specific; they include all pottery and tile counts by tract, regardless of age.

CU (survey) database – The full CU (Collection Unit, i.e., “tract”) database, which includes all tract-survey data from all teams together in one place. This file is a .CSV export from FileMaker. Each entry includes data about one surveyed tract (see data dictionary). Tract locations are available via the accompanying GIS shape files. [NOTE: some tract database entries lack complete location data, e.g., a UTM Northing is present but not the Easting. In such cases, locations are available via the spatial (shape file) data.]

CU (survey) database, by team – A copy of each team’s (A–K) Collection Unit (CU; i.e., “tract”) database is also included. These files are .CSV exports from the original FileMaker database.

CU (survey) database data dictionary:
Tract number – each tract was labeled with the team letter as a prefix followed by a number in consecutive order: e.g., A-001 is the first tract surveyed by A team in 2010, A-002 the second, and so on
Date surveyed – date on which the tract was surveyed
Team leader – ID number of the individual who led the survey team
Notebook pages – pages in the team leader notebook where a tract is described
Team members – ID numbers of the individuals who composed the survey team on any given day
Visibility – the degree to which a tract’s ground surface could be seen, reported as a percentage, with 100 equal to a fully visible ground surface
Vegetation cover – general description of the plant or plants growing in a particular tract, e.g., bushes, trees, grass
Comments – general description of the tract, including any features and any informant testimony
Photo no. – the numbers of the photos taken of each tract. Each tract photo file is labeled with this number and recorded in the corresponding photolog
Date recorded – date on which data were entered into the database
Recorder – individual(s) who entered the data
Other – number of small finds counted in a tract
Pottery – number of potsherds counted in a tract
Tile – number of tile fragments counted in a tract

Field team reports – PDFs of the reports written by survey team leaders at the end of the season, including the report as submitted and a final edited version. [NOTE: in some cases, only the final edited version of a report is included.]

Field notebooks – PDFs of the original field notebooks kept by survey team leaders. [NOTE: some of these are poorly scanned and difficult to read.]

Site Data (https://doi.org/10.7302/j87w-mt17)

Each data type in the Site Data sub-collection is organized by site, of which there are 17. Each site was given a number, with the prefix S for site, e.g., S001 = Site 001 = Kratul i Madh, S002 = Site 002 = Kullaj, and so on. There are site data for many but not all sites. For those lacking a site record, users should consult the Site Database. Sites that are tumuli (Sites 006, 008, 014, and 016) have additional supplemental data associated with Realm 3. Each site record includes the following works:

Site database – A .CSV export from the original FileMaker database that includes a description of each site.

Site database data dictionary:
Site no. – number assigned to site, e.g., S001, S002
Name – local toponym, or PASH number in the case of tumuli
Location – location of the site in UTM coordinates
Periods represented – relative chronology represented at the site based on pottery analysis
Chronology notes – description of the site based on periods of occupation
Directions to site – general description of how to get to the site
Features – large, immovable objects or structures at the site, such as walls, houses, etc.
Possible function – proposed site use, e.g., hillfort
Geological observations – brief description of the geological setting
Collection method – short description of how artifacts were collected at the site, if it was systematically collected
Preservation – degree to which the site has been damaged, and if so, how
Size – estimated area of site in hectares
Team leader(s) – team leaders who oversaw site collection, or site reconnaissance
Visibility – average ground visibility across site, measured from 0 to 100, with 100 being complete ground visibility

Site documentation – PDFs of scans of miscellaneous documents related to a particular site, including maps, wall drawings, original notes, etc. For those sites that were systematically surface collected (Sites 001, 002, 003, and 011), scans of the site collection grid and raw counts of collected artifacts (on a “Site Collection Form”) are also included.

Site photos and photologs – A folder with photos taken of each site. These are JPEGs. Some (e.g., S004) are labeled with site number, date taken, and an appended ID number in consecutive order (e.g., S001-060610-001 = Site 001, June 6, 2010, Photo 001 from Kratul i Madh). Others are labeled generically. ID numbers were repeated at each subsequent site surveyed, beginning again with 001. Some site photos are accompanied by a photolog, while others are not. Photolog files are .CSV exports from Excel. The photolog typically lists each photo taken of a site in order by photo number, with a description.

*(Note that the realm-level information in this document is also included in the “Documentation for the PASH Collection” record attached to the overall PASH collection record: https://deepblue.lib.umich.edu/data/concern/data_sets/6t053g548. Subcollection- and file-level information is included only in this document.)
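
Example: decoding photo labels. The photo-labeling convention described in this document (team letter or site number, then date taken, then a sequence number that restarts at 001) can be decoded programmatically. The Python sketch below is illustrative only and is not part of the PASH deposit. The names parse_photo_id and PhotoId are hypothetical, and the day-month-year reading of the six-digit date is an assumption inferred from the worked example A-020610-001 = June 2, 2010. Labels are assumed to be given without a file extension. Site photos that are labeled generically will not match and are rejected.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class PhotoId(NamedTuple):
    """Decoded photo label (hypothetical helper type, not part of the deposit)."""
    prefix: str    # team letter ("A".."K") or site number ("S001", "S002", ...)
    kind: str      # "tract" for team photos, "site" for site photos
    taken: date    # date the photo was taken
    sequence: int  # running photo number, restarting at 001


# Team letter A-K or S + three digits, then six date digits, then three sequence digits.
_LABEL = re.compile(r"^(?P<prefix>[A-K]|S\d{3})-(?P<date>\d{6})-(?P<seq>\d{3})$")


def parse_photo_id(label: str) -> PhotoId:
    """Split a label such as 'A-020610-001' into its parts.

    Assumption: the date block is DDMMYY, as implied by the example
    'A-020610-001 = A team, June 2, 2010'.
    """
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"label does not follow the documented pattern: {label!r}")
    prefix = match["prefix"]
    taken = datetime.strptime(match["date"], "%d%m%y").date()
    kind = "site" if prefix.startswith("S") else "tract"
    return PhotoId(prefix, kind, taken, int(match["seq"]))


if __name__ == "__main__":
    tract_photo = parse_photo_id("A-020610-001")
    print(tract_photo.prefix, tract_photo.taken.isoformat(), tract_photo.sequence)
    site_photo = parse_photo_id("S001-060610-001")
    print(site_photo.prefix, site_photo.taken.isoformat(), site_photo.sequence)
```

Because sequence numbers restart each survey day (tract photos) or at each site (site photos), the prefix, date, and sequence number should be used together when matching a photo to its photolog entry.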