# PyPal

PyPal synchronizes and merges Wi-Fi traces captured by sniffers. It is an evolved Python version of the WiPal tool [1]. One of its main features is that it creates "per MAC address" traces.

The tool takes two traces (in CSV or TXT format) as input and performs the operation you select. The traces need to contain the following fields:

- frame.number: Frame_number
- frame.time_epoch: Frame_time_epoch
- wlan.fixed.timestamp: Fixed_timestamp
- wlan_radio.signal_dbm: RSSI_dBm
- wlan_radio.channel: Channel
- wlan.fc.type: Frame_type
- wlan.fc.type_subtype: Frame_subtype
- wlan.fc.retry: Retransmission
- wlan.fcs: Checksum
- wlan.sa: Source_MAC_address
- wlan.seq: Sequence_number
- wlan.frag: Fragment_number

You can use the following tshark command to extract the above-mentioned fields from a pcap file:

tshark -r pcap_input_file -Y '!_ws.malformed and wlan_radio.channel==1' -T fields -E header=y -E separator=/t -e frame.number -e frame.time_epoch -e wlan.fixed.timestamp -e wlan_radio.signal_dbm -e wlan_radio.channel -e wlan.fc.type -e wlan.fc.type_subtype -e wlan.fc.retry -e wlan.fcs -e wlan.sa -e wlan.seq -e wlan.frag > csv_or_txt_output_file

**It is, however, essential to clearly define which data may be sniffed, depending on the location of the measurement campaign, to preserve the privacy of the users. It is also necessary to hash the MAC addresses to preserve privacy.**
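
As an illustration of the hashing step, here is a minimal Python sketch. It is not part of PyPal: the function name `hash_mac`, the choice of keyed SHA-256 (HMAC), and the placeholder key are all assumptions made for this example. The same key would have to be used for both traces so that a given station keeps the same identifier in each of them.

```python
import hashlib
import hmac


def hash_mac(mac: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of a MAC address as a hex string.

    Hypothetical helper for illustration; PyPal itself may hash differently.
    """
    # Normalize the notation so "AA-BB-..." and "aa:bb:..." hash identically.
    normalized = mac.strip().lower().replace("-", ":")
    return hmac.new(secret_key, normalized.encode("ascii"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"replace-with-a-secret-key"  # placeholder key, keep the real one private
    print(hash_mac("00:11:22:33:44:55", key))
```

A keyed hash is used here rather than a plain hash because the MAC address space is small enough to be enumerated; without a secret key the original addresses could be recovered by brute force.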

**Steps involved in synchronization:**

Beacons are the closest representatives of real-time clocks, so we use these frames as the basis for synchronizing the traces. The tool takes two traces as input: a reference trace and the trace to be synchronized. The first step is to independently extract the beacons that are common to both traces; the coverage areas of the sniffers capturing these traces must therefore overlap. These common frames are referred to as reference frames. In the next step, the timestamps of the reference frames are synchronized using linear regression over a sliding window of 3 frames. The synchronized reference frames are then used to synchronize the complete trace. The tool also provides the option of concatenating or merging the synchronized traces [1].
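
The regression step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not PyPal's actual code: the function names `fit_line` and `synchronize_reference_frames` are hypothetical, the window is assumed to be centered on each frame (and clamped at the trace boundaries), and the fitted line is assumed to map the timestamps of the trace being synchronized onto those of the reference trace.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("timestamps in the window must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def synchronize_reference_frames(local_ts, reference_ts, window=3):
    """Map each local timestamp onto the reference clock.

    local_ts[i] and reference_ts[i] are the timestamps of the same
    reference frame (a common beacon) as seen by the two sniffers.
    """
    if len(local_ts) != len(reference_ts):
        raise ValueError("both lists must describe the same reference frames")
    n = len(local_ts)
    if n < window:
        raise ValueError("need at least `window` reference frames")
    half = window // 2
    synced = []
    for i in range(n):
        # Center the window on frame i, clamped so it stays inside the list.
        start = min(max(i - half, 0), n - window)
        slope, intercept = fit_line(
            local_ts[start:start + window], reference_ts[start:start + window]
        )
        synced.append(slope * local_ts[i] + intercept)
    return synced


if __name__ == "__main__":
    local = [0.0, 1.0, 2.0, 3.0, 4.0]
    reference = [5.0, 7.0, 9.0, 11.0, 13.0]  # reference clock = 2 * local + 5
    print(synchronize_reference_frames(local, reference))
```

Fitting over a short window rather than the whole trace lets the correction follow clock drift that changes over time, at the cost of being more sensitive to noise in individual timestamps.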

**How to run the tool:**

It is preferable to use Python 3.

Running python3 pypal.py -h prints information on how to operate the tool.

You need to have the following libraries installed:
- numpy
- pandas
- scikit-learn

The tool has two positional arguments, which are the two traces:
- trace1: trace to be synchronized
- trace2: reference trace

There are several optional arguments, and you must tell the tool which one you want to use. Only one optional argument can be used at a time. The arguments are given below:
- -U : extract unique frames
- -R : extract unique reference frames
- -SR : synchronize unique reference frames
- -S : synchronize traces
- -C : concatenate traces (and keep the duplicate frames)
- -M : merge the traces and remove the duplicate frames within a time difference of 106µs.
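
To make the "two traces plus exactly one option" rule concrete, the following sketch models such a command line with Python's standard `argparse` module. It is an assumption about how an interface like this could be declared, not PyPal's actual source: the `dest` names, the help strings, and the example file names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Model a CLI with two positional traces and one mandatory option."""
    parser = argparse.ArgumentParser(prog="pypal.py")
    parser.add_argument("trace1", help="trace to be synchronized")
    parser.add_argument("trace2", help="reference trace")

    # Exactly one of the options below must be given on the command line.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-U", dest="unique", action="store_true",
                       help="extract unique frames")
    group.add_argument("-R", dest="reference", action="store_true",
                       help="extract unique reference frames")
    group.add_argument("-SR", dest="sync_reference", action="store_true",
                       help="synchronize unique reference frames")
    group.add_argument("-S", dest="sync", action="store_true",
                       help="synchronize traces")
    group.add_argument("-C", dest="concatenate", action="store_true",
                       help="concatenate traces, keeping duplicate frames")
    group.add_argument("-M", dest="merge", action="store_true",
                       help="merge traces, removing duplicate frames")
    return parser


if __name__ == "__main__":
    # Explicit argument list so the sketch runs without a real command line.
    args = build_parser().parse_args(["-S", "trace1.csv", "trace2.csv"])
    print(args.trace1, args.trace2, args.sync)
```

With a declaration like this, passing two options at once (for example -S together with -M) or no option at all makes the parser exit with a usage error.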

The time synchronization error (the difference between the timestamps that two different sniffers assign to the same frame) has to be less than half the minimum gap between two valid IEEE 802.11 frames. In the IEEE 802.11b protocol, the minimum gap can be calculated as the 192 microsecond preamble delay plus the 10 microsecond SIFS (Short Inter-Frame Space) plus the 10 microsecond minimum transmission time for a MAC frame, for a total of 212 microseconds [2].
The precision is therefore 212/2 = 106 µs.
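
The arithmetic above, and the way the resulting threshold could be applied, can be written out as a short Python sketch. The constant and function names are hypothetical, and the check only compares timestamps; how PyPal actually decides that two records describe the same frame is not shown here.

```python
# IEEE 802.11b timing values quoted in the text, in microseconds.
PREAMBLE_US = 192
SIFS_US = 10
MIN_MAC_FRAME_US = 10

# Minimum gap between two valid frames, and half of it as the precision.
MIN_GAP_US = PREAMBLE_US + SIFS_US + MIN_MAC_FRAME_US
PRECISION_US = MIN_GAP_US / 2


def within_precision(ts_a_us: int, ts_b_us: int) -> bool:
    """True if two timestamps (in microseconds) differ by less than the precision."""
    return abs(ts_a_us - ts_b_us) < PRECISION_US


if __name__ == "__main__":
    print(MIN_GAP_US, PRECISION_US)
    print(within_precision(1_000, 1_050))
```

Because two distinct frames cannot be closer than the minimum gap, two records whose synchronized timestamps differ by less than half that gap cannot belong to different consecutive frames, which is why this value is used as the duplicate threshold.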

[1] T. Claveirole and M. Dias de Amorim, “WiPal: Efficient offline merging of IEEE 802.11 traces,” SIGMOBILE Mob. Comput. Commun. Rev., vol. 13, no. 4, pp. 39–46, Mar. 2010. [Online]. Available: https://doi.org/10.1145/1740437.1740443

[2] J. Yeo, M. Youssef, and A. Agrawala, “A framework for wireless LAN monitoring and its applications,” in Proceedings of the 3rd ACM Workshop on Wireless Security, ser. WiSe ’04. New York, NY, USA: Association for Computing Machinery, 2004, pp. 70–79. [Online]. Available: https://doi.org/10.1145/1023646.1023660

Please cite the technical report of the tool:

@techreport{syed:hal-03618014,
  TITLE = {{PyPal: Wi-Fi Trace Synchronization and Merging Python Tool}},
  AUTHOR = {Syed, Mohammad Imran and Flandenmuller, Anne and Dias de Amorim, Marcelo},
  URL = {https://hal.archives-ouvertes.fr/hal-03618014},
  TYPE = {Technical Report},
  INSTITUTION = {{LIP6 UMR 7606, UPMC Sorbonne Universit{\'e}, France}},
  YEAR = {2022},
  MONTH = Mar,
  PDF = {https://hal.archives-ouvertes.fr/hal-03618014v3/file/PyPal.pdf},
  HAL_ID = {hal-03618014},
  HAL_VERSION = {v3},
}

M. I. Syed, A. Flandenmuller, M. Dias de Amorim. PyPal: Wi-Fi Trace Synchronization and Merging Python Tool. [Technical Report] LIP6 UMR 7606, UPMC Sorbonne Université, France. 2022. ⟨hal-03618014v3⟩

**If you face any difficulties, please feel free to get in touch via email: mohammad-imran.syed@lip6.fr**