Multi-platform Process Flow Models and Algorithms for Extraction and Documentation of Digital Forensic Evidence from Mobile Devices

The growing need to examine evidence from mobile and portable gadgets makes it essential to establish dependable procedures for investigating them. The requirements for examining each gadget differ widely. To help investigators and examiners guarantee that any piece of evidence extracted or collected from a mobile device is well documented, a reliable and well-documented investigation process must be implemented if the results of the examination are to be repeatable and defensible in courts of law. In this paper we developed a generic process flow model for the extraction of digital evidence from mobile devices running the Android, Windows, iOS and BlackBerry operating systems. The research adopted a survey approach and an extensive literature review as means of collecting data. The models developed were validated through expert opinion. The results of this work can guide solution developers in ensuring standardization of evidence extraction tools for mobile devices.


INTRODUCTION
Attempts to use a range of mobile forensic tools and process models to extract information from multiple devices have yielded conflicting results [1]- [3]. Therefore, special attention should be paid to ensure that the methods are correct so that usability improvement can be achieved [4]. The overriding importance of documentation approaches is that they can allow an investigator to remember the steps taken to gather information, which in turn reduces allegations of mishandling [5].
Most researchers confirm that forensic science suffers from a lack of documentation and transparency [6]. Therefore, standard and well-researched approaches to documentation and extraction are key. The purpose of the documentation is to facilitate the extraction process in legally acceptable ways [7], [8]. While the investigator would do well to extract the necessary information using the tools available, further details on that information are useful mainly for judicial proceedings [9].
The term digital forensics refers to the process of retrieving and examining material from digital devices, primarily in connection with computer crime or cybercrime [10], [11]. The role of forensic science is to use investigative methodologies, measures, and frameworks to extract, preserve, collect, analyze, and provide [12] scientific and technical evidence to criminal or civil courts and tribunals, and to organize good documentation for prosecutions. Digital forensics, in turn, is the practice of finding, securing, examining and presenting evidence in a legally acceptable manner [12]. These definitions are supported by [13], who state that digital evidence is investigatively relevant material and records that are stored, delivered, or transmitted via an electronic device.
The steady industrial growth and growing popularity of mobile digital devices amplify the challenges that investigators and prosecutors around the world face. The existence of different tools and systems with different process models makes it difficult even for a trained investigator to select a suitable forensic tool to acquire the internal files of mobile devices [14]. Many forensic models emphasize the examination of particular operating system platforms [15], ignoring the more critical aspect of consistency and documentation of the approaches and steps taken. Although [16] listed many forensic techniques for preserving evidence, viewed from the standpoint of efficiency in the general forensic context, little effort has been made regarding methodological documentation and the consistency of the process models followed when extracting and documenting evidence from mobile devices. Similarly, [17] notes that despite growing awareness and research on forensic practice, explanation and implementation are still inconsistent in the digital forensic community, a point supported by recent research such as [9], [18], [19].
Continuously changing technological and industry developments, coupled with the complexity created by today's demand for information from mobile devices, present forensic investigators with serious challenges in standardizing and adopting acceptable models that can meet this growing demand [20], [21].
The reliability of evidence is directly anchored to the investigative processes adopted. Skipping a step can therefore lead to insufficient evidence and increase the risk that the evidence is challenged in legal proceedings [22]. Currently, no standard or universally accepted process model exists for obtaining evidence from mobile devices, and the rapid expansion of smart devices means that each forensic investigator has to use whatever independent models are needed to gather and preserve information [23].
Existing models cannot meet the growing demand for digital evidence that results from the increasing use of mobile devices and the complexity that persistent criminals bring to the use of these devices. Moreover, some of these models focus on a specific step of the extraction process or depend on the operating system platform [24]. Existing research in digital forensics shows that process models can be used to collect evidence from mobile devices. In general, the literature specifies the requirements that guide and measure the process of extracting digital evidence from mobile devices and its performance. These include reliability and validity, guidelines, extraction methods, nature of data, type of data, technical documentation, and forensic extraction tools.

METHODOLOGY
The present study was performed in four phases, as depicted in Figure 1. In the first phase, the literature on specific email security techniques was reviewed. In the second phase, the algorithm was developed. In the third phase, the algorithm was evaluated using questionnaires administered to selected participants. In the last phase, a SWOT analysis was carried out.

Study Area, Design and Period
The research was conducted in Kampala, Uganda, as this is where the researcher found most of the respondents with knowledge of the subject. From this location, the researcher was able to reach law enforcement bodies such as the police, court bailiffs, computer forensics experts and professionals, evidence extraction and computer forensics investigators, and the mobile telecommunications and banking sectors, which have crime/fraud departments for investigating crimes related to the use of technology. A cross-sectional study design was used over a one-year period from 2018 to 2019.

Population and Sample Size
The study population comprised law enforcement respondents, specifically the Uganda Police (Crime Intelligence and Investigation Department (CIID)) and the prosecution service; court officials (lawyers, registrars, judges and magistrates); policy makers; regulators such as the Uganda Communications Commission (UCC) and the National Information Technology Authority Uganda (NITA-U); and a business community made up of telecommunications operators such as Mobile Telecommunication Network (MTN-Uganda) and Airtel Uganda, as these are the largest telecommunications service providers offering financial services, and banks such as Stanbic Bank, Centenary Bank, Barclays Bank Uganda and Standard Chartered Bank, as these are the largest providers of online transaction systems using mobile digital devices in their operations.
Snowball sampling was used to complement purposive (judgmental) sampling, especially when examining the different operating system platforms, the inconsistencies, and the technical documentation of extraction process models. For probability sampling, stratified and simple random sampling were used: the researcher collected data from different sectors, classified them into strata, and applied simple random sampling within each stratum.
The sample size was determined using the sample table of Krejcie and Morgan [26], which is derived from their formula. The calculation presented in Table 1 was based on p = 0.05, where the probability of making a Type I error is less than 5% (p < 0.05) [26]. A population size of 10 was considered for law enforcement agencies, with a sample size of 7. The largest number of respondents came from ICT experts, with a population of 100 and a sample size of 63. This was followed by the business community (people in the banking industry and telecommunication agencies), with a population size of 70 and a sample size of 31.
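For readers unfamiliar with the formula behind the Krejcie and Morgan table, a minimal Python sketch follows. It uses the published form s = X²NP(1 − P) / (d²(N − 1) + X²P(1 − P)) with the usual defaults X² = 3.841, P = 0.5 and d = 0.05. The function name and the example populations are hypothetical; the per-stratum figures reported in this section are the study's own and are not recomputed here.

```python
def krejcie_morgan_sample_size(population, chi_square=3.841,
                               proportion=0.5, margin=0.05):
    """Sample size for a finite population (Krejcie and Morgan formula).

    chi_square: chi-square value for 1 degree of freedom at 95% confidence
    proportion: assumed population proportion (0.5 gives the largest sample)
    margin:     degree of accuracy expressed as a proportion
    """
    numerator = chi_square * population * proportion * (1 - proportion)
    denominator = (margin ** 2) * (population - 1) \
        + chi_square * proportion * (1 - proportion)
    return round(numerator / denominator)


# Hypothetical population sizes, chosen only for illustration.
for n in (500, 1000):
    print(n, krejcie_morgan_sample_size(n))
```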

Data Collection
Questionnaires and interviews were used in this study. The questionnaires covered a wide range of segments of the selected population, provided a consistent form of response, reduced bias, did not make respondents anxious, and were completed at the discretion of the respondent [27]. Questionnaires were designed for different categories of respondents, such as policymakers, law enforcement, researchers, ICT experts, regulators and the business community, to obtain different types of data from these categories. They were developed based on understanding gained from the literature reviewed in areas such as mobile devices, operating systems, platforms, technical documentation, and inconsistency and complexity of process models as independent variables, with a cross-platform digital forensic evidence extraction process model for mobile devices as the dependent variable. The questionnaires used the standard five-point Likert scale ranging from strongly agree to strongly disagree. The interviews complemented the questionnaires and were tightly structured. They were conducted primarily with information and communication technology (ICT) experts within law enforcement, policy makers, regulators and industry, as well as with staff in the data recovery and forensic departments of agencies such as telecommunications networks and the banking sector, and with researchers in the field of digital forensics.

Data Quality Assurance
The term "reliability" describes the repeatability or consistency of a measure [28]. The internal consistency method was used in this study. According to Chen [29], this method uses a single measure administered once to a group of people to estimate reliability. The reliability of the tool is assessed by estimating how well items that reflect the same construct produce comparable results. Cronbach's alpha (α) was chosen to estimate the reliability of the constructs by examining the internal consistency of the measure. As indicated by Spencer [30], there are four levels of the reliability coefficient α: excellent reliability (α >= 0.90), high reliability (0.70 < α < 0.90), moderate reliability (0.50 < α < 0.70) and low reliability (α <= 0.50). All constructs used in this study passed the reliability test, as shown in Table 2. The highest Cronbach's alpha (α = 0.850) was achieved by the FET construct, while the lowest was achieved by the PF construct (α = 0.591). Following Perry et al. [28], these figures indicate that out of 8 constructs, 5 had high reliability while 3 had moderate reliability, implying that the constructs were internally consistent and that the items of each construct measured it consistently.
The validity of the instruments was determined using the Content Validity Index (CVI), which was computed on the constructs to ensure that the scale items were meaningful to the sample and captured the issues being measured. The measurement tools were tested for quality and validity after a pilot study with 30 questionnaires. The content validity indices from the three experts were 0.982, 0.964 and 0.967. Since all content validity coefficients were > 0.6, the scales used to measure the study variables were considered consistent.
Moreover, the instrument is considered valid because a Cronbach's alpha greater than 0.5 is taken to indicate moderate validity and one greater than 0.90 excellent validity. In this study, all constructs had values greater than 0.50, indicating good to excellent validity, meaning that all constructs and sub-indices passed the validity tests.
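The internal-consistency calculation described above can be illustrated with a short Python sketch. It uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores) and the bands attributed to Spencer [30] in the text. The function names and the pilot responses are hypothetical, and assigning α = 0.70 exactly to the "high" band is an assumption, since the quoted bands leave that boundary open.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one construct.

    responses: list of respondents, each a list of k item scores
    (for example, five-point Likert answers). Assumes k >= 2 and that
    total scores are not all identical (otherwise the variance is zero).
    """
    k = len(responses[0])
    items = list(zip(*responses))                 # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def reliability_band(alpha):
    """Map alpha to the bands quoted in the text (0.70 boundary assumed)."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.70:
        return "high"
    if alpha > 0.50:
        return "moderate"
    return "low"


# Hypothetical pilot data: five respondents, three items on one construct.
pilot = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
alpha = cronbach_alpha(pilot)
print(round(alpha, 3), reliability_band(alpha))

# The two extremes reported in the study fall in these bands.
print(reliability_band(0.850))   # high
print(reliability_band(0.591))   # moderate
```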

Ethical Consideration
Ethical approval for the survey was obtained from the Institutional Research Ethics Board of Busitema University, and informed consent was obtained from respondents prior to their voluntary enrolment in the study.
Ethical aspects such as data protection and respondent confidentiality were ensured [31]. Additionally, a letter was obtained from the university, which served as an introductory document for the various organizations and individuals involved in this research. It was also ensured that the developed extraction model does not perform any unintended or unknown activity on users' devices.

Statistical Analysis
The analysis was performed using the Statistical Package for the Social Sciences (SPSS) version 20.0 (SPSS, Chicago, Illinois), and descriptive statistics were used to extract results from the analysis of all study variables. Descriptive statistics were computed for all the constructs to determine their significance using the mean responses. The means were then used to rank the constructs, according to the participants' responses, as contributors to inconsistencies in mobile device evidence extraction process models. Regression analysis was done with the consistency metric (CM) as the dependent variable and constructs including EM, FET, PF, DF, ND, and DTF as independent variables.

Multi-Platform Flow Model
The model design and validation involved the use of the business process, model development, the analytical hierarchy approach (AHA), and experimental work and experts' opinion to validate the developed model. An experimental setup was used to test the developed process model and check for consistency in the extraction process models. The process flow for the multi-platform model is depicted in Figure 2. The individual flow models for the iOS and Windows mobile devices are presented in Figure 3 and Figure 4, respectively.

Description of Extraction Algorithms
First, the gadget is seized for evidence extraction. A check is made to determine which operating system it is running. In the case of Android OS, the Android extraction process is performed under Extract From Android (SiezedDevice). It starts by checking the status of the gadget, such as power, Wi-Fi connection and cellular network. This check is performed on all gadgets to ensure that each has power and has no network connection issues. After this check, Universal Serial Bus (USB) debugging is enabled through the developer options, the screen timeout is prolonged, and root access is obtained. Then, different directories/locations are browsed to obtain the SQLite databases, which can be opened to collect evidence that is documented using Documents (directory dictionary). The same steps are followed each time, and the documentation ensures consistency.
In the case of iOS, as depicted in Figure 5, Extract From iOS (SiezedDevice) follows the same gadget status check; the difference arises when connecting to a personal computer, where a trust code is required between the device and the computer for iOS 11 and above. Documentation occurs through Documents (directory dictionary). During extraction from Windows devices, as shown in Figure 6, Extract From Windows (SiezedDevice) is activated, which requires installing the Windows Phone SDK, the Zune software, and the Windows Phone Device Manager. The gadget status check is done. Once the gadget is connected to the workstation, the automatic installation of Touch Xperience on the phone follows. This allows various directories to be browsed and files to be accessed, and the findings are documented using Documents (directory dictionary).
Finally, for BlackBerry-based gadgets, there are relatively small variations from the other devices. Extract from BlackBerry (SiezedDevice) is performed, and data is acquired from backup files rather than from the device itself because of its security complexity. BlackBerry Desktop Software is installed and opened; it detects the BlackBerry device and creates backup files. The files are browsed for evidence, which is documented using Documents (directory dictionary).
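The control flow described above (identify the operating system, run the common status check, run the platform-specific steps, and document every step) can be sketched in Python as follows. This is an illustrative outline only, not the algorithm from the figures: the function and variable names are hypothetical, the step lists are paraphrased from the text, and running the status check before the platform steps on every platform is an assumption. No real device is accessed; each step is only recorded.

```python
from datetime import datetime, timezone

# Common status checks applied to every seized gadget (from the text).
STATUS_CHECKS = ("power", "wifi", "cellular")

# Platform-specific steps, paraphrased from the description above.
PLATFORM_STEPS = {
    "android": [
        "enable USB debugging through developer options",
        "prolong screen timeout",
        "obtain root access",
        "browse directories for SQLite databases",
    ],
    "ios": [
        "connect to workstation and confirm trust code (iOS 11 and above)",
        "browse directories",
    ],
    "windows": [
        "install Windows Phone SDK, Zune and Windows Phone Device Manager",
        "connect to workstation (Touch Xperience installs automatically)",
        "browse directories and files",
    ],
    "blackberry": [
        "install and open BlackBerry Desktop Software",
        "detect device and create backup files",
        "browse backup files",
    ],
}


def extract_evidence(os_name, status):
    """Return an ordered documentation log for one seized device.

    os_name: operating system reported for the device
    status:  dict of observed status values, e.g. {"power": "on"}
    """
    platform = os_name.lower()
    if platform not in PLATFORM_STEPS:
        raise ValueError(f"unsupported platform: {os_name}")

    log = []

    def document(step, detail):
        # Every step is recorded with a sequence number and a UTC timestamp
        # so that the procedure can be repeated and audited.
        log.append({
            "seq": len(log) + 1,
            "time": datetime.now(timezone.utc).isoformat(),
            "platform": platform,
            "step": step,
            "detail": detail,
        })

    document("seize device", "device seized for evidence extraction")
    for check in STATUS_CHECKS:
        document(f"status check: {check}", status.get(check, "not recorded"))
    for step in PLATFORM_STEPS[platform]:
        document(step, "completed")
    return log


for entry in extract_evidence("Android", {"power": "on", "wifi": "off"}):
    print(entry["seq"], entry["step"], "->", entry["detail"])
```

Keeping the documentation call inside the same routine that performs each step means that no step can run without leaving a record, which is the consistency property the model aims for.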

Validation of the Model
The developed model was validated using two approaches, namely, experts' opinions and literature comparison. In the first approach, expert opinion was based on the model applicability and functionality. The experts used were purposely selected from information technology, information security and computer forensic and network security fields, law enforcement agencies, solution developers as well as researchers in the field of computer and digital forensics. The second approach was through comparison with the previous models in the literature.

Applicability and Functionality of the Model
Descriptive statistics were used to assess the applicability of the model in measuring the state of digital forensic evidence extraction process models for mobile devices, based on feedback from experts in the field of digital forensic evidence extraction. The model validation based on applicability is depicted in Table 3. Across all applicability items, 86.6% of the participants confirmed the applicability of the developed model in driving the digital forensic evidence extraction process for mobile devices, while only 13.4% disagreed. The functionality of the developed Digital Forensic Evidence Extraction Process Model was validated as depicted in Table 4. It was observed that 6.4% of the respondents had a positive view of the model's ease of use. Similarly, 8.5% of the participants confirmed independence among the modules within the model, that the model is applicable in the digital forensic evidence extraction process for mobile devices, and that it uses simple language.

Comparison Analysis
A comparative analysis was performed between the developed model and metrics and the existing models and metrics discussed in the literature. The developed model was found to cover more than the models discussed in the literature. The proposed model is therefore suitable for extracting digital forensic evidence from mobile devices running the four operating system platforms (Android, Windows, Apple iOS and BlackBerry), as shown in Table 5. The Smartphone Forensic Investigation model is close to the proposed model, except that it focuses more on the investigation than on evidence extraction and lacks the device status check and data retrieval phases, which the proposed model highlights as crucial issues in digital evidence extraction from mobile devices.

Reliability Testing
The Cronbach's α values of the constructs, between 0.591 and 0.850, demonstrated the internal consistency of the constructs used in this study, with none of the constructs falling below the moderate reliability level. The predictive power of the regression model, with an adjusted R-squared of 0.848, indicates an appropriate level of explained variance [28]. This implies that the independent variables and constructs used in this study are significant for understanding the causes of inconsistencies in the digital evidence extraction process model for mobile devices with different operating systems and platforms. For example, the study results showed that the extraction method used during the extraction and analysis of evidence, such as whether the examiner applies a logical, manual, physical or brute force approach when examining a mobile device, plays a significant role in ensuring consistency. Similarly, the forensic documentation process emerged as a key contributor to ensuring consistency in the processes followed during evidence extraction, whereby certain stages or phases in the extraction process ought to be documented if the results are to be repeatable and defensible in courts of law. This justifies the choice of the constructs used in this study, which have support from the literature. The results therefore raise several issues that may be of interest to ICT practitioners, researchers, law enforcement authorities, regulatory authorities, and the business community in gaining a clear understanding of the factors that cause inconsistencies in digital forensic evidence extraction from mobile devices [19], [32]-[34].
Once these factors are clearly understood, factoring them in during solution development and paying attention to them during an investigation by forensic examiners or investigators would aid law enforcement agencies in collecting, preserving, and presenting evidence to courts of law.
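As a reminder of how the adjusted R-squared quoted in this section relates to the ordinary R-squared, a minimal Python sketch follows. It uses the standard relation adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The function name and the numbers in the example are hypothetical; the values of n and p behind the reported 0.848 are not restated in this section, so the reported figure is not recomputed.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared from R-squared, sample size and predictor count."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof


# Hypothetical illustration: R^2 = 0.5 with 11 observations and 2 predictors.
print(adjusted_r_squared(0.5, 11, 2))
```

The adjustment penalizes each additional predictor, so it is always at most the unadjusted value whenever at least one predictor is present.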

Descriptive Statistics for the Constructs
The descriptive statistics presented in Table 6 show how the constructs rank based on mean responses: PF ranks highest with a mean response of 4.36, followed by FDP, while FET has the lowest mean response. This means that if there is a clear policy regarding the handling, acquisition, storage, documentation and presentation of digital evidence, there should be minimal inconsistencies in the process model for extracting digital evidence from mobile devices. The forensic documentation process comes next, in agreement with recent studies indicating a lack of clear technical documentation of existing process models and methods for extracting digital evidence from mobile devices [6]. Forensic extraction tools rank last of the eight constructs. This can be attributed to the fact that several digital evidence extraction tools exist and most investigators face challenges in choosing the right tool for the mobile device platform at hand [20].

Policy Factor (PF)
The means and standard deviations of the aggregate measures for the seven items used to measure the PF construct, PF1 to PF7, are presented in Table 7. Strong agreement was reached for the policy factor construct. The higher the average correlation between items, the higher the construct's reliability coefficient, Cronbach's alpha (α), provided the number of items is held constant [28]. Table 8 shows the inter-item correlation for the items used to measure the PF construct. Most items had acceptable inter-item correlation (r >= 0.2). The least agreed items, namely that enacting laws governing mobile device digital forensic evidence extraction has a positive effect on inconsistencies in evidence extraction (PF6) and that developing strategies and frameworks for examining mobile device digital forensic evidence has a positive effect on inconsistencies in evidence extraction in mobile devices (PF7), were also the least correlated with the rest of the items. In contrast, the items stating that setting policy guidelines for extracting digital forensic evidence from mobile devices leads to a consistent process for retrieving digital forensic evidence (PF1), that creating a mobile device digital forensic evidence handling unit within the organization reduces inconsistencies in mobile device digital forensic evidence extraction (PF3), and that recruiting qualified personnel to handle mobile device digital forensic evidence has a positive effect on inconsistencies in evidence extraction (PF4) were positively correlated with the rest of the items for the construct.
There was a moderate relationship (r = 0.55) between the item on formulating policy guidelines for extracting digital forensic evidence from mobile devices (PF1) and the item on establishing a mobile device forensic evidence unit within the organization (PF3), as well as a low correlation (r = 0.395) between the item on recruiting qualified personnel to handle mobile device digital forensic evidence (PF4) and the item on enacting laws for mobile device digital forensic evidence extraction (PF6). We can therefore conclude that the items selected to measure the policy factor (PF) were suitable for the measure.
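The inter-item correlation check used for this and the following constructs can be illustrated with a short Python sketch. It computes Pearson's r for every pair of items and lists the pairs that fall below the r >= 0.2 acceptance threshold mentioned above. The function names and the response data are hypothetical and are not taken from Table 8.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Assumes neither list is constant (a constant list has zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def inter_item_correlations(items):
    """items: dict of item label -> scores, same respondent order in each."""
    return {(a, b): pearson(items[a], items[b])
            for a, b in combinations(items, 2)}


def weak_pairs(correlations, threshold=0.2):
    """Pairs whose correlation falls below the acceptance threshold."""
    return [pair for pair, r in correlations.items() if r < threshold]


# Hypothetical five-point Likert responses from six respondents.
responses = {
    "PF1": [5, 4, 4, 3, 5, 2],
    "PF3": [5, 4, 3, 3, 4, 2],
    "PF6": [3, 4, 2, 5, 3, 4],
}
correlations = inter_item_correlations(responses)
for pair, r in correlations.items():
    print(pair, round(r, 3))
print("below threshold:", weak_pairs(correlations))
```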

Device Factor
The means and standard deviations of the aggregate measures for the three items used to measure the DF construct are shown in Table 9. Table 10 shows the inter-item correlation for the items used to measure the DF construct. As observed, most items had an acceptable inter-item correlation (r >= 0.2). The least agreed item was mobile device type (DF2), which was least correlated with mobile device version (DF3) (r = 0.155). There was a moderate relationship (r >= 0.568) between mobile device type (DF2) and device connection parameters (DF4) (r = 0.331), and a weak correlation between mobile device status during evidence collection (DF1) and both DF2 and DF3 (0.279 < r < 0.386). We can therefore conclude that the items selected to measure DF were suitable for the measurement of this construct [28], [35].

Extraction Method Factor
The means and standard deviations of the aggregated measures for the ten items used to measure the extraction method factor (EMF) construct are shown in Table 11. There is strong agreement for the extraction method factor construct, with an average score of (Mean = 4.12, StdDev = 0.83). Physical acquisition (EMF3) was the most commonly agreed item (mean = 4.46, StdDev = 0.716), followed by EMF1 (mean = 4.39). Similarly, Table 12 shows the inter-item correlation; most of the items had acceptable inter-item correlation (r >= 0.2). The least agreed items were architecture (EMF6), file system (EMF8), data storage mechanism (EMF9) and instant messaging applications (EMF10). These were also the least correlated with manual acquisition (EMF1), logical acquisition (EMF2) and physical acquisition (EMF3) (r <= 0.2). There was a moderate relationship (r >= 0.589) between EMF1 and EMF2, as well as a low correlation between EMF3 and both EMF2 and EMF10 (0.2 < r < 0.386). We can therefore conclude that the items selected for the EMF measurement were suitable for the measured construct.

Nature of Data Factors
The means and standard deviations of the aggregate measures for the five items used to measure the nature of data (ND) construct are shown in Table 13. There was a moderate relationship (r >= 0.540) between internal and visible data (ND1) and external and visible data (ND3) (r >= 0.549), and a weak correlation between external but hidden data (ND4) and external and visible data (ND3) (r <= 0.295). We can therefore conclude that the items chosen to measure ND were appropriate for the measurement.
... [4], [36]-[38], and then other factors such as policies [32], [39], [40], nature of data [41] and type of data [15], [42] have a small contribution to inconsistencies in the evidence extraction process. In Table 16, two factors emerged as highly significant, namely the extraction method factor (B = 1.030) and the device factor (B = 0.078). These positive values indicate that as these independent variables increase, the consistency metric (the dependent variable) also increases, which is supported by the literature [28]. The regression coefficients also indicate that as some independent variables increase, the consistency decreases and the standard error decreases. For example, the nature of data had B = -0.029 and Beta = -0.037, with sig. = 0.443. The implication is that these factors do not contribute significantly to the consistency metric and therefore have less impact on the consistency of the process model when extracting evidence from mobile devices running the four OS platforms used in this study.
The results of this study showed that forensic extraction tools, extraction methods, nature of the data, type of device, and the forensic documentation process are the main factors contributing to inconsistencies in extraction. These findings agree with recent studies that revealed discrepancies in retrieving and reporting data residing on a device between earlier tool tests and updated or new versions of the tool.
This is in line with the results of the interviews, which showed that the type of data, the nature of the data and the method of extraction are major causes of inconsistency in mobile device forensic evidence models. Furthermore, the study results established that the policy factor is a benchmark for specifying a consistent model of digital forensic evidence extraction for mobile devices based on Android, Windows, iOS and BlackBerry OS. In addition, the device factor is part of the metrics for specifying a consistent model of digital forensic evidence extraction for mobile devices based on the four operating systems (OSs).

Correlation of individual OS and Constructs
The present study showed that the extraction method factor is a metric for specifying a consistent digital forensic evidence extraction model for mobile devices based on the four OSs. The results also revealed that the nature of data factors are metrics for specifying a consistent model of digital forensic evidence extraction for mobile devices based on the four OSs. This is consistent with Cusack [43], who posits that the high-level process of digital forensics involves collecting data from a source, data analysis and evidence extraction, as well as the storage and presentation of evidence. This study also found that forensic extraction tools are metrics for specifying a consistent model of digital forensic evidence extraction for mobile devices based on the four OSs, and that the forensic documentation process is likewise part of the metrics for specifying such a model.

CONCLUSION
The extraction process model developed borrowed the principles of consistency, repeatability, and standardization presented in earlier studies of the generalized forensic framework. The model goes further by enumerating, in sequence, each step that should be followed in evidence extraction for each of the mobile operating systems, thereby ensuring consistency at every step of the extraction process. Following these sequential steps (stages) is expected to yield positive results across the four mobile operating systems, and it is believed that this model can act as a standard for other mobile operating system platforms that were not part of this study, considering that the architecture of mobile devices does not differ significantly in terms of storage, processing, and application. The Smartphone Forensic Investigation model is close to the proposed model, except that it concentrates more on the investigation than on evidence extraction and critically lacks the device status check and data recovery phases, which the proposed model identifies as key issues in digital evidence extraction from mobile devices. Future work should focus on practically testing these models and comparing the results for consistency across different operating system platforms.

Data Availability
Research data underlying the findings of the study can be accessed upon request from the corresponding author.