Commit 3b4a9634 authored by Devi Sribala Ganapathi's avatar Devi Sribala Ganapathi
Browse files

Implemented some todos, folder struct

parent 204757c9
This diff is collapsed.
# Quinone_Eutectics
# README for Paper: Large Decrease in Melting Point of Benzoquinones via High-n Eutectic Mixing Predicted by a Regular Solution Model
## Repository Structure
- Assumes that all of the data files are in a folder ../data/ which is organized into subfolders named with the YYYYMMDD date.
- Also assumes that there is another folder ../plots/ which has subfolders for different plots.
- Also assumes that there is another folder ../pub_out/
Virtual Environment Information:
Make sure you have conda installed on your computer and up to date
Following these instructions: https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/
To create a new virtual environment:
- In the terminal, type: conda create -n eutecticquinones python=3.7 anaconda
- You can replace eutecticquinones with whatever you want your virtual environment to be called, as long as it is a single word (no spaces)
- To activate virtual environnment, in the terminal, type: conda activate eutecticquinones
- You should see (eutecticquinones) in front of your command line input now
- When (if) in vscode, make sure you select the right interpreter by clicking on Python X.X.X in the bottom left corner, then selecting the (eutecticquinones) interpreter from the top menu. You can also navigate to this by pressing command+shift+p and typing interpreter -> select interpreter -> choose the one in the virtual environment. You might need to restart VSCode to get it to show up.
Python version: 3.7
This installation of python already includes all the necessary packages required for this code (see below for details). If any of the packages are not installed in your version of python, you can look up how to install them with conda
Python version: 3.7
- Packages that are used in this code:
- os
- platform
- csv
- ast
- re
- time
- sys
- itertools
- math
- copy
- pandas
- numpy
- matplotlib
- scipy
# Data Files:
- Included are the raw data files for the Q2-Q5 Binary system (at the 50-50 composition) and the Q2-Q5-Q6 Ternary system (at the eutectic composition calculated by both the ideal solution model and the regular solution model, and including the 50-50 binary mixtures of each "pair" in this sytem)
\ No newline at end of file
import DSC
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import gridspec
import ast
import csv
import sys
starttime= time.time()
# TODO: Remove the parent directory path so it's in the same folder
# TODO: Rename file so it's just analyze DSC
mainfolder = os.path.dirname(os.path.realpath(__file__))
datamainfolder = mainfolder+'/Data'
# plotfolder = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))+'/pub_out'
plotfolder = mainfolder + '/Plots'
analysisfilefolder = mainfolder + '/Analysis_Files'
date_str = str(time.localtime().tm_year)+str(time.localtime().tm_mon).zfill(2)+str(time.localtime().tm_mday).zfill(2)
##########################
##### NOTES
##########################
# Folders: assumes that the
# script and data are in separate folders in some main folder,
# i.e. the script is in "../mainfolder/script/"
# and the data is in "../mainfolder/data/20190521/"
# (where the date for each run is a separate folder in the data folder)
# and there is another folder for outputs, "../mainfolder/pub_out/"
# TODO: Include the csv with the data files that we're including in the repository
# TODO: Decide how to update below comment
# The input spreadsheet (spreadsheet_name) should be of the same form as the provided "DSC experiments - DSC_publish.csv"
# New lines can be added to this spreadsheet with new DSC data files
# To analyze new DSC data files, add the data files to the spreadsheet in the same way as the existing columns
##########################
##### INPUTS
##########################
# Input type: either put 'single' for analysis of a single run,
# or 'loop' for analysis of all the runs in the spreadsheet
input_type = 'single'
# Fill in the filename if looking at a single input
the_filename='20200903_Q2-Q4-Q5_1-1-1.txt'
# If you want to play around with the fit parameters here, set this to True
# (Only matters for input_type = single)
# Otherwise, if False, the fit parameters in the spreadsheet will be used, and if there is nothing
# there, then the default will be 1 for the fit_nums and 0.001 for s
use_these_fit_parameters = False
fit_num_left = 8
fit_num_right =1
s = 0.001
# Input list
#### If you want to run all of the experiments in the DSC experiments - DSC_publish.csv, comment out the below line uncomment the "filenamelist = 'all'" line. ####
#### If you only want to run a select few experiments, write them in a txt file and include that file in the main folder of this repository (same level this scipt)
#### This file goes in the Analysis_Files subfolder
# filenamelist = 'list_publish_select.txt'
filenamelist = 'all'
# Input spreadsheet - this is the spreadsheet that contains all the information about the DSC experiments. An example file is included
spreadsheet_name = 'DSC experiments - DSC_publish.csv'
# Output spreadsheet - this stores all the results from this analysis so it can be used to calculate interaction parameters with the other scripts.
# It will be saved in the Analysis_Files folder
output_spreadsheet = 'DSC_publish_results_'+date_str+'.csv'
##########################
##### Data import and analysis
##########################
spreadsheet = DSC.import_spreadsheet(mainfolder+'/',spreadsheet_name)
if input_type == 'loop':
with open(analysisfilefolder +'/'+output_spreadsheet,'w') as f:
writer = csv.writer(f)
writer.writerow(['Title','Filename','Process','Peaks','Eutectic_peak','mp_C','Hm','H_eutectic'])
if input_type == 'single':
filenames = [the_filename]
elif input_type == 'loop':
if filenamelist == 'all':
filenames = [row.Filename for i,row in spreadsheet.iterrows()]
else:
with open(analysisfilefolder+'/'+filenamelist) as f:
filenames = f.read().splitlines()
# sys.exit('Stopped in program')
# TODO: Need to include a copy of the spreadsheet with the appropriate columns in the github repository, because this code assumes a certain structure
# TODO: Add example DSC files in an example DSC folder? I think it would actually be easier to change the repository structure
# so the data files are a subfolder in the main folder. That way it's easier for users to see examples
for filename in filenames:
# try:
plt.close('all')
# Import the relevant data from the input spreadsheet
filefolder=str(int(spreadsheet[spreadsheet.Filename==filename].DateFolder.values[0]))
data = DSC.import_DSC(datamainfolder+'/'+filefolder,filename)
baseline_fitting_points = ast.literal_eval(spreadsheet[spreadsheet.Filename==filename].baseline_fitting_points.values[0])
approx_eutectic_temp = float(spreadsheet[spreadsheet.Filename==filename].approx_eutectic_temp.values[0])
enthalpy_fitting_points = ast.literal_eval(spreadsheet[spreadsheet.Filename==filename].enthalpy_fitting_points.values[0])
process = spreadsheet[spreadsheet.Filename==filename].Process.values[0]
moles = spreadsheet[spreadsheet.Filename==filename].Sample_mg.values[0]/1000./spreadsheet[spreadsheet.Filename==filename].MW_avg.values[0]
ramp = spreadsheet[spreadsheet.Filename==filename].Ramp.values[0]
plot_eutectic_integration = True if spreadsheet[spreadsheet.Filename==filename].integrate_eutectic.values[0] == 'Y' else False
plot_title = spreadsheet[spreadsheet.Filename==filename].plot_title.values[0]
# find peaks
peaks = DSC.find_peaks(data,num=process)
# pick the closest peak to the "approximate eutectic" as the eutectic
eutectic_peak = min(peaks,key=lambda x:abs(x-approx_eutectic_temp))
# prepare the fit_nums
if input_type == 'single':
if not use_these_fit_parameters:
fit_num_left = spreadsheet[spreadsheet.Filename==filename].fit_num_left.values[0]
fit_num_right = spreadsheet[spreadsheet.Filename==filename].fit_num_right.values[0]
fit_nums = {'left':fit_num_left if fit_num_left==fit_num_left else 1,'right':fit_num_right if fit_num_right==fit_num_right else 1}
s1 = spreadsheet[spreadsheet.Filename==filename].s.values[0]
s = s1 if s1==s1 else 0.001
else:
fit_nums = {'left':fit_num_left,'right':fit_num_right}
elif input_type == 'loop':
fit_num_left = spreadsheet[spreadsheet.Filename==filename].fit_num_left.values[0]
fit_num_right = spreadsheet[spreadsheet.Filename==filename].fit_num_right.values[0]
s = spreadsheet[spreadsheet.Filename==filename].s.values[0]
fit_nums = {'left':fit_num_left if fit_num_left==fit_num_left else 1,'right':fit_num_right if fit_num_right==fit_num_right else 1}
# fits at the 2nd derivative going to 0
lin_fits,lin_fit_temps = DSC.linear_fit_peak_sides(data,num=process,peaks=[eutectic_peak],fit_nums=fit_nums,s=s if s==s else 0.001,return_temps=True)
# baseline
x = data[(data.Process==process)&((data.Temp>enthalpy_fitting_points[0])&(data.Temp<enthalpy_fitting_points[1]))].Temp
baseline = DSC.baseline_vanderPlaats(data,x,num=process,locations=baseline_fitting_points,Tb=enthalpy_fitting_points[0],Te=enthalpy_fitting_points[1])
# mp
mp = DSC.mp_from_baseline_and_fit(x,baseline,lin_fits[0][0])
# enthalpy
enthalpy_vdp = DSC.calc_enthalpy_vanderPlaats(data,(enthalpy_fitting_points[0],enthalpy_fitting_points[1]),moles,ramp,num=process,locations=baseline_fitting_points,baseline_values=baseline)
if plot_eutectic_integration:
baseline_props = {'locations':baseline_fitting_points,'Tb':enthalpy_fitting_points[0],'Te':enthalpy_fitting_points[1]}
eutectic_enthalpy,eutectic_enthalpy_pts = DSC.calc_enthalpy_peak(data,enthalpy_fitting_points[0],lin_fits,lin_fit_temps,baseline_props,ramp,num=process,return_pts=True)
else:
eutectic_enthalpy = np.nan
print('====== File '+filename+' ======')
print('Process = '+str(process))
print('Peaks found at: '+str(peaks))
print('Eutectic peak located at: '+str(eutectic_peak)+' C (approx. eutectic was '+str(approx_eutectic_temp)+' C)')
print('Melting point calculated: '+str(mp)+' C')
print('enthalpy of melting = '+str(enthalpy_vdp)+' J/mol')
if plot_eutectic_integration:
print('enthalpy of eutectic peak = '+str(eutectic_enthalpy)+' J')
print('Time to Run = ' + str(time.time()-starttime) + ' s')
if input_type == 'loop':
with open(analysisfilefolder+'/'+output_spreadsheet, 'a') as f:
writer = csv.writer(f)
writer.writerow([plot_title,filename,str(process),str(peaks),str(eutectic_peak),str(mp),str(enthalpy_vdp),str(eutectic_enthalpy)])
# except:
# print('******************************')
# print('Error occurred for '+filename)
# if input_type == 'loop':
# with open(plotfolder+'/'+output_spreadsheet, 'a') as f:
# writer = csv.writer(f)
# writer.writerow(['ERROR',filename,'','','','','',''])
try:
##########################
##### Data plotting
##########################
plt.figure(figsize=(6.25,2.9), dpi=100)
gs = gridspec.GridSpec(1, 3, wspace=0.15)
ax1 = plt.subplot(gs[0])
# Title
if plot_title == plot_title:
plt.title(plot_title,loc='left',fontsize=12)
ax2 = plt.subplot(gs[1], sharey=ax1, sharex=ax1)
ax2.get_yaxis().set_visible(False)
axes_to_plot = [ax1,ax2]
if plot_eutectic_integration:
ax3 = plt.subplot(gs[2], sharey=ax1, sharex=ax1)
ax3.get_yaxis().set_visible(False)
axes_to_plot = [ax1,ax2,ax3]
for ax in axes_to_plot:
# plot data
ax.plot(data[data.Process==process].Temp,data[data.Process==process].Heat/moles/1000.,linewidth=1,c=(49/255.,91/255.,138/255.))
# plot baseline
ax.plot(x,baseline/moles/1000.,'--',linewidth=1,c=(200/255.,91/255.,108/255.))
# plot mp fits on ax1
x_for_mp_fit = data[(data.Process==process)&((data.Temp>mp)&(data.Temp<lin_fit_temps[0][0]))]
ax1.plot(x_for_mp_fit,np.polyval(lin_fits[0][0],x_for_mp_fit)/moles/1000.,'--',c=(243/255.,117/255.,105/255.))
ax1.scatter(mp,np.polyval(lin_fits[0][0],mp)/moles/1000.,s=10,c='k')
ax1.scatter(lin_fit_temps[0][0],np.polyval(lin_fits[0][0],lin_fit_temps[0][0])/moles/1000.,s=10,c='k')
# plot integration on ax2
ax2.fill_between(x,baseline/moles/1000.,data[(data.Process==process)&((data.Temp>enthalpy_fitting_points[0])&(data.Temp<enthalpy_fitting_points[1]))].Heat/moles/1000.,color=(83/255.,143/255.,204/255.))
# plot eutectic integration on ax3
if plot_eutectic_integration:
ax3.fill_between(eutectic_enthalpy_pts[0],eutectic_enthalpy_pts[1]/moles/1000.,eutectic_enthalpy_pts[2]/moles/1000.,color=(83/255.,143/255.,204/255.))
### Ranges
range_temp = spreadsheet[spreadsheet.Filename==filename].plot_range_temp.values[0]
range_heatnormalized = spreadsheet[spreadsheet.Filename==filename].plot_range_heatnormalized.values[0]
## use the fact that nan != nan
if range_temp == range_temp:
ax1.set_xlim(ast.literal_eval(range_temp))
# TODO: add an else statment with an error message saying that the provided temperature ranges are invalid?
if range_heatnormalized == range_heatnormalized:
ax1.set_ylim(ast.literal_eval(range_heatnormalized))
### Labels
ax1.set_xlabel(r"Temp ($^{\circ}$C)")
ax1.set_ylabel("Heat Flow (W/mole)\n(endotherm down)")
for theaxis in axes_to_plot:
for item in ([theaxis.xaxis.label, theaxis.yaxis.label, theaxis.yaxis.get_offset_text(), theaxis.xaxis.get_offset_text()]):
item.set_fontsize(12)
for item in (theaxis.get_xticklabels() + theaxis.get_yticklabels()):
item.set_fontsize(10)
plt.gcf().subplots_adjust(left=0.16,bottom=0.17,top=0.91)
plt.savefig(plotfolder+'/'+filename[0:-4]+'_proc'+str(process)+'.png',format='png',dpi=300)# ,transparent=True)
if input_type == 'single':
plt.show()
except:
print('******************************')
print('Error occurred during plotting '+filename)
print('Time to Run = ' + str(time.time()-starttime) + ' s')
\ No newline at end of file
# TODO: Delete the below comments?
# New made on 7/16/20
# Original file seems to be 20180928 immiscible model fit.py
import os
import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fsolve
##########################
##### NOTES
##########################
# This code is used to calculate the interaction parameter, L, for a binary quinone (or hydroquinone) mixture
# TODO: Include more information on what is in the spreadsheets called in analysis type 2, and where these should be?
# TODO: The files are also the outputs from the analyze_dsc_for_pub.py
starttime= time.time()
folder = os.path.dirname(os.path.realpath(__file__))
analysisfilefolder = folder + '/Analysis_Files'
##########################
##### INPUTS
##########################
# Type of analysis
# 1 = single, or
# 2 = loop from spreadsheets
analysis_type = 1
# physical constants
R = 8.314 # J/mol/K
# Inputs for analysis type 1
Aname = 'Q6'
Hmelt_A = -18611.894530000001 # J/mol
Tmelt_A = 393.84296729999994
Bname = 'Q22'
Hmelt_B = -17447.37977
Tmelt_B = 317.46808540999996
Te = 310.44587120999995
# Initial guesses for analysis type 1
Lguess = 2000 # J/mol
xeguess = 0.8
# immiscible solids model
# TODO: Can we change p to a more intuitive parameter name? I think it's initial guesses but I'm not sure
def equations(p, *data):
L, xe = p
Tm_A, Hm_A, Tm_B, Hm_B, Te = data
eq1 = Hm_B*(Te/Tm_B - 1) + R*Te*np.log(xe) + L*(1-xe)**2
eq2 = Hm_A*(Te/Tm_A - 1) + R*Te*np.log(1-xe) + L*(xe)**2
return eq1, eq2
if analysis_type == 1:
data = (Tmelt_A,Hmelt_A,Tmelt_B,Hmelt_B,Te)
result, infodict, ier, mesg = fsolve(equations, (Lguess,xeguess), args=data,full_output=True)
print(mesg)
print('L: '+str(result[0]))
print('xe: '+str(result[1]))
elif analysis_type == 2:
# assumes this single file has columns mp_C, Hfus
singledf = pd.read_csv(analysisfilefolder + '/quinones_OLD.csv')
# assumes this binary files has columns T_e
binarydf = pd.read_csv(analysisfilefolder + '/binary_inputs_OLD.csv')
binarydf['L_model'] = ''
binarydf['xe_model'] = ''
Lguess = 2000 # J/mol
xeguess = 0.5
for i,row in binarydf.iterrows():
Te = row.T_e + 273.15
Tmelt_A = singledf[singledf.comp==row.comp1].mp_C.values[0] + 273.15
Hmelt_A = -singledf[singledf.comp==row.comp1].Hfus.values[0]
Tmelt_B = singledf[singledf.comp==row.comp2].mp_C.values[0] + 273.15
Hmelt_B = -singledf[singledf.comp==row.comp2].Hfus.values[0]
if row.comp1 == 'Q6' and row.comp2 == 'Q22':
xeguess = 0.9
data = (Tmelt_A,Hmelt_A,Tmelt_B,Hmelt_B,Te)
result, infodict, ier, mesg = fsolve(equations, (Lguess,xeguess), args=data,full_output=True)
binarydf.at[i, 'L_model'] = result[0]
binarydf.at[i, 'xe_model'] = result[1]
binarydf.at[i, 'mesg'] = mesg
binarydf.to_csv(analysisfilefolder+'/L_out.csv')
\ No newline at end of file
import os
import sys
import copy
import numpy as np
import scipy.optimize as sciop
import math
import pandas as pd
from itertools import combinations
##########################
##### NOTES
##########################
# This code is used to calculate the approximate eutectic composition and temperature for an n-component mixture
# TODO: Add more notes about how to use this code
# TODO: It looks like we need another subfolder, "input"
folder = os.path.dirname(os.path.realpath(__file__))
analysisfilefolder = folder + '/Analysis_Files'
# physical constants
R = 8.314 # J/mol/K
kb = 0.008314 #kJ/molK
#class used as input to solver:
class equil:
def __init__(self, tempList, enthList, mixCompList):
#Initilization sets up parameters for calculation
#Length of list = number of components in system
#ordering of tempList and enthList must be consistent
self.tempList = copy.copy(tempList) #List of melting temperatures for components being calculated
self.enthList = copy.copy(enthList) #List of enthalpies for components being calculated
self.mixCompList = copy.copy(mixCompList) #List of components (entries in the overall component list which is sorted by temperature)
return
def __call__(self, x):
#What the solver calls
#x[0] to x[-2]: mol fractions of components
#x[-1]: temperature
#Initializing output data
#outData[0] to outData[-2] equations: chemical potentials between solid and liquid phase must be equal
#outData[-1] equation: all mole fractions add up to 1.0
outData = np.zeros( len(x) )
#Looping through mole fractions
for counti, i in enumerate(self.mixCompList):
#Calculating activity coefficient using mole fractions and interaction parameter
lnGam = 0.0
for countj, j in enumerate(self.mixCompList):
if j != i:
lnGam += AData[i][j]*x[countj]*(1.0-x[counti])
for countk, k in enumerate(self.mixCompList):
if (j != k) and (i != k):
lnGam += -0.5*AData[j][k]*x[countj]*x[countk]
#Calculating function output
outData[counti] = math.exp( -self.enthList[i]/kb*(1.0/x[-1] - 1.0/self.tempList[i]) ) - x[counti]*math.exp(lnGam/(kb*x[-1]*1000.0))
outData[-1] += x[counti]
outData[-1] += -1
return outData
##########################
##### INPUTS
##########################
# read temp and enthalpy values from .csv files
# file for single components (melting point and enthalpy of fusion)
# TODO: Make single input-output folder (this input is the output from the analyze_dsc_for_pub.py)
singledf = pd.read_csv(folder + '/Analysis_Files/single_components_paper_20210325.csv')
# file for interaction parameters
binarydf = pd.read_csv(folder + '/Analysis_Files/L_out_20210325.csv')
# components to find eutectics
comps_of_interest = ['Q3','Q6','Q12','Q22']
# orders of eutectic to find (number of components)
# TODO: Why are ns 3, but it looks like there are 4 components above? Is it a 0-indexing thing? If so we should add a comment about that
ns = [3]
#output file
outfile_path = folder+'/Analysis_Files/eutectic_out_20210410_2.csv'
##########################
##### CALCULATIONS
##########################
singledf['mp_K'] = singledf['mp_C']+273.15
# get lists of the temperatures, enthalpies, and component names, all sorted by mp
tList = singledf[singledf.comp.isin(comps_of_interest)].sort_values('mp_K')['mp_K'].tolist()
hList = [Hfus/(1000.) for Hfus in singledf[singledf.comp.isin(comps_of_interest)].sort_values('mp_K')['Hfus'].tolist()]
compList = singledf[singledf.comp.isin(comps_of_interest)].sort_values('mp_K')['comp'].tolist()
# combinations to test
# batchList is a list of, for each n, the mixtures of that n
batchList = [[]]
for n in ns:
if n > len(comps_of_interest):
sys.exit('Error: n cannot be greater than the number of components of interest')
batchList.append([foo for foo in combinations(range(len(compList)),n)])
# Read in interaction energies into a symmetric matrix, ordered by melting point
AData = []
for comp1 in compList:
interaction_params = []
for comp2 in compList:
# interaction_params.append(0) #uncomment/comment for ideal solution
# TODO: above line is the only line needed for an ideal solution. Add variable at top of script selecting whether to run ideal or regular solution model calculation
if comp1==comp2:
interaction_params.append(0)
else:
# note the order of the components in the file is random, so which one is comp1 or comp2 has to be checked
if len(binarydf[(binarydf.comp1==comp1)&(binarydf.comp2==comp2)]) > 0:
interaction_params.append(binarydf[(binarydf.comp1==comp1)&(binarydf.comp2==comp2)].L_model.values[0])
else:
interaction_params.append(binarydf[(binarydf.comp2==comp1)&(binarydf.comp1==comp2)].L_model.values[0])
AData.append(interaction_params)
eutectic_lst = []
#Looping through every composition
for i in range(len(batchList)):
#On first loop, use melting temps for 1 component systems
if i == 0:
for t,c in zip(tList,compList):
eutectic_lst.append({'n':1,'T_e':t-273.15,'components':c})
else:
#Initializing mol fraction and eutectic temperatures
xLen = len(batchList[i][0])
xData = np.zeros( xLen+1 )
for j in range(len(batchList[i])):
xData[j] = 1.0/float(xLen)
xData[-1] = min(tList[x] for x in batchList[i][j])
#Solving
newFunc = equil(tList, hList, batchList[i][j])
result, infodict, ier, mesg = sciop.fsolve(newFunc, xData, full_output=True)
newentry = {'n':xLen,'T_e':result[-1]-273.15,'components':','.join([compList[q] for q in batchList[i][j]]),'composition':list(result[0:-1]),'infodict':infodict,'ier':ier,'mesg':mesg }
eutectic_lst.append(newentry)
newFunc = []
eutectics = pd.DataFrame(eutectic_lst)
eutectics = eutectics.sort_values(by=['T_e'])
eutectics.to_csv(outfile_path)
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment