ParetoNBDModel
- class pymc_marketing.clv.models.pareto_nbd.ParetoNBDModel(data, *, model_config=None, sampler_config=None)
- Pareto Negative Binomial Model (Pareto/NBD).
- Model for customers in a continuous, non-contractual setting, first introduced by Schmittlein et al. [1], with additional derivations and predictive methods by Hardie & Fader [2] [3] [4] [5].
- The Pareto/NBD model assumes the time duration a customer is active follows a Gamma distribution, and time between purchases is also Gamma-distributed while the customer is still active.
- This model requires data to be summarized by recency, frequency, and T for each customer, using clv.rfm_summary() or equivalent. Covariates impacting customer dropouts and transaction rates are optional.
- Parameters:
- data : DataFrame
- DataFrame containing the following columns:
- customer_id: Unique customer identifier
- frequency: Number of repeat purchases
- recency: Time between the first and the last purchase
- T: Time between the first purchase and the end of the observation period. Model assumptions require T >= recency
 - The DataFrame may also contain optional covariate columns.
- model_config : dict, optional
- Dictionary containing model parameters and covariate column names:
- r: Shape parameter of time between purchases; defaults to Weibull(alpha=2, beta=1)
- alpha: Scale parameter of time between purchases; defaults to Weibull(alpha=2, beta=10)
- s: Shape parameter of time until dropout; defaults to Weibull(alpha=2, beta=1)
- beta: Scale parameter of time until dropout; defaults to Weibull(alpha=2, beta=10)
- purchase_covariates: Coefficients for purchase rate covariates; defaults to Normal(0, 3)
- dropout_covariates: Coefficients for dropout covariates; defaults to Normal(0, 3)
- purchase_covariate_cols: List containing column names of covariates for customer purchase rates.
- dropout_covariate_cols: List containing column names of covariates for customer dropouts.
 - If not provided, the model uses the default priors specified in the default_model_config class attribute.
- sampler_config : dict, optional
- Dictionary of sampler parameters. Defaults to None.
 
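The frequency, recency, and T columns described above can be illustrated with a small standard-library sketch. This is not the implementation of clv.rfm_summary(); the function name `summarize_rfm`, the sample transactions, and the choice of days as the time unit are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date


def summarize_rfm(transactions, observation_end):
    """Summarize (customer_id, purchase_date) pairs into frequency, recency, and T (in days).

    Hypothetical helper for illustration; it is not part of pymc_marketing.
    """
    # Collect distinct purchase dates per customer inside the observation window.
    dates_by_customer = defaultdict(set)
    for customer_id, purchase_date in transactions:
        if purchase_date <= observation_end:
            dates_by_customer[customer_id].add(purchase_date)

    summary = {}
    for customer_id, dates in dates_by_customer.items():
        first, last = min(dates), max(dates)
        row = {
            # Repeat purchases: distinct purchase dates after the first one.
            "frequency": len(dates) - 1,
            # Time between the first and the last purchase.
            "recency": (last - first).days,
            # Time between the first purchase and the end of the observation period.
            "T": (observation_end - first).days,
        }
        # The model assumptions require T >= recency.
        assert row["T"] >= row["recency"]
        summary[customer_id] = row
    return summary


# Made-up transaction log; customer "a" buys twice on the same day once.
transactions = [
    ("a", date(2024, 1, 1)),
    ("a", date(2024, 1, 8)),
    ("a", date(2024, 1, 8)),
    ("a", date(2024, 2, 1)),
    ("b", date(2024, 1, 15)),
]
rfm = summarize_rfm(transactions, observation_end=date(2024, 3, 1))
for customer_id, row in sorted(rfm.items()):
    print(customer_id, row)
```

A customer with a single purchase gets frequency 0 and recency 0, which still satisfies T >= recency.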
- References
- [1] David C. Schmittlein, Donald G. Morrison and Richard Colombo. “Counting Your Customers: Who Are They and What Will They Do Next”. Management Science, Vol. 33, No. 1 (Jan., 1987), pp. 1-24.
- [2] Fader, Peter & Hardie, Bruce G. S. (2005). “A Note on Deriving the Pareto/NBD Model and Related Expressions”. http://brucehardie.com/notes/009/pareto_nbd_derivations_2005-11-05.pdf
- [3] Fader, Peter & Hardie, Bruce G. S. (2014). “Additional Results for the Pareto/NBD Model”. https://www.brucehardie.com/notes/015/additional_pareto_nbd_results.pdf
- [4] Fader, Peter & Hardie, Bruce G. S. (2014). “Deriving the Conditional PMF of the Pareto/NBD Model”. https://www.brucehardie.com/notes/028/pareto_nbd_conditional_pmf.pdf
- [5] Fader, Peter & Hardie, Bruce G. S. (2007). “Incorporating Time-Invariant Covariates into the Pareto/NBD and BG/NBD Models”. https://www.brucehardie.com/notes/019/time_invariant_covariates.pdf
- Examples

    import pandas as pd
    import pymc as pm
    from pymc_extras.prior import Prior

    from pymc_marketing.clv import ParetoNBDModel, rfm_summary

    # `raw_data` is a DataFrame of raw transactions, one row per purchase
    rfm_df = rfm_summary(raw_data, "id_col_name", "date_col_name")

    # Initialize model with customer data; `model_config` parameter is optional
    model = ParetoNBDModel(
        data=rfm_df,
        model_config={
            "r": Prior("Weibull", alpha=2, beta=1),
            "alpha": Prior("Weibull", alpha=2, beta=10),
            "s": Prior("Weibull", alpha=2, beta=1),
            "beta": Prior("Weibull", alpha=2, beta=10),
        },
    )

    # Fit model quickly to large datasets via the default Maximum a Posteriori method
    model.fit(method="map")
    print(model.fit_summary())

    # Use 'demz' for more informative predictions and reliable performance on smaller datasets
    model.fit(method="demz")
    print(model.fit_summary())

    # Predict number of purchases for customers over the next 10 time periods
    expected_purchases = model.expected_purchases(
        data=rfm_df,
        future_t=10,
    )

    # Predict probability of customer making 'n' purchases over 't' time periods
    # Data parameter is omitted here because predictions are run on the original dataset
    expected_num_purchases = model.expected_purchase_probability(
        n=[0, 1, 2, 3],
        future_t=[10, 20, 30, 40],
    )

    new_data = pd.DataFrame(
        data={
            "customer_id": [0, 1, 2, 3],
            "frequency": [5, 2, 1, 8],
            "recency": [7, 4, 2.5, 11],
            "T": [10, 8, 10, 22],
        }
    )

    # Predict probability customers will still be active in 'future_t' time periods
    probability_alive = model.expected_probability_alive(
        data=new_data,
        future_t=[0, 3, 6, 9],
    )

    # Predict number of purchases for a new customer over 't' time periods
    expected_purchases_new_customer = model.expected_purchases_new_customer(
        t=[2, 5, 7, 10],
    )

- Methods
- ParetoNBDModel.__init__(data, *[, ...]): Initialize model configuration and sampler configuration for the model.
- Convert the model configuration and sampler configuration from the attributes to keyword arguments.
- Build the model from the InferenceData object.
- Build the model.
- Create attributes for the inference data.
- Compute posterior predictive samples of dropout, purchase rate and frequency/recency of new customers.
- Sample from the Gamma distribution representing dropout times for new customers.
- ParetoNBDModel.distribution_new_customer_purchase_rate([...]): Sample from the Gamma distribution representing purchase rates for new customers.
- ParetoNBDModel.distribution_new_customer_recency_frequency([...]): Pareto/NBD process representing purchases across the customer population.
- ParetoNBDModel.expected_probability_alive: Compute expected probability of being alive.
- ParetoNBDModel.expected_purchase_probability: Compute expected probability of n_purchases over future_t time periods.
- ParetoNBDModel.expected_purchases([data, ...]): Compute expected number of future purchases.
- ParetoNBDModel.expected_purchases_new_customer: Compute the expected number of purchases for a new customer across t time periods.
- ParetoNBDModel.fit([method, fit_method]): Infer posteriors of model parameters to run predictions.
- ParetoNBDModel.fit_summary(**kwargs): Compute the summary of the fit result.
- ParetoNBDModel.graphviz(**kwargs): Get the graphviz representation of the model.
- Create the initialization kwargs from an InferenceData object.
- ParetoNBDModel.load(fname[, check]): Create a ModelBuilder instance from a file.
- ParetoNBDModel.load_from_idata(idata[, check]): Create a ModelBuilder instance from an InferenceData object.
- ParetoNBDModel.save(fname, **kwargs): Save the model's inference data to a file.
- ParetoNBDModel.set_idata_attrs([idata]): Set attributes on an InferenceData object.
- ParetoNBDModel.table(**model_table_kwargs): Get the summary table of the model.
- ParetoNBDModel.thin_fit_result(keep_every): Return a copy of the model with a thinned fit result.
- Attributes
- default_model_config: Default model configuration.
- default_sampler_config: Default sampler configuration.
- fit_result: Get the posterior fit_result.
- id: Generate a unique hash value for the model.
- posterior
- posterior_predictive
- predictions
- prior
- prior_predictive
- version
- idata
- sampler_config
- model_config
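To make the role of the four parameters r, alpha, s, and beta more concrete, here is a minimal sketch of the closed-form expected number of purchases for a new customer over t time periods, as derived for the Pareto/NBD model in reference [2]: E[X(t)] = r * beta / (alpha * (s - 1)) * (1 - (beta / (beta + t)) ** (s - 1)). The function name `pareto_nbd_expected_purchases` and the parameter values are hypothetical. This is a point-estimate illustration, not the library's implementation, which works with the fitted posterior.

```python
def pareto_nbd_expected_purchases(t, r, alpha, s, beta):
    """Expected purchases of a new customer in (0, t] under the Pareto/NBD model.

    Hypothetical helper for illustration; it is not part of pymc_marketing.
    """
    if s == 1:
        # The closed form divides by (s - 1); the s = 1 limit needs separate handling.
        raise ValueError("closed form is undefined at s == 1")
    return (r * beta) / (alpha * (s - 1)) * (1 - (beta / (beta + t)) ** (s - 1))


# Made-up parameter values, chosen only to show the shape of the curve.
params = {"r": 0.6, "alpha": 10.0, "s": 0.7, "beta": 12.0}
for horizon in [2, 5, 7, 10]:
    print(horizon, pareto_nbd_expected_purchases(horizon, **params))
```

The expectation grows with t but flattens over time, because a customer who has dropped out makes no further purchases.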
